Abstract

Physical  systems  can  frequently  be  modeled  by  polynomial  equations. 
Then  interesting  properties  of  the  systems  can  be  determined  from  the 
zeros  of  the  polynomials.  Standard  codes  compute  those  zeros  from  the 
coefficients  in  a stable  fashion.  But  what  should  be  done  if  the 
zeros  are  inherently  hypersensitive  to  changes  in  the  coefficients 
of  their  polynomials?  Newly  developed  methods  can  be  used  to  explain 
such  an  ill  conditioned  polynomial  by  exhibiting  a nearby  polynomial 
with one or more multiple zeros which are well conditioned. Furthermore
these methods can be abused by uncritically replacing the ill
conditioned polynomial with the well conditioned one nearby. When
such a replacement is unwarranted, bounds can be obtained on the variation
of the zeros corresponding to the uncertainty in the coefficients.
One way to obtain such bounds is to exploit the nearby well conditioned
polynomial to obtain a revision of the classical Puiseux
fractional power series expansions of the zeros.

These notions have been investigated experimentally in a long
series  of  computer  calculations.  In  the  course  of  these  calculations 
the  existing  stock  of  numerical  techniques  has  been  augmented.  A new 
way  is  now  known  for  computing  the  condition  numbers  which  measure  the 
condition  of  zeros.  The  previously  known  equations  to  be  solved  for 
the  nearest  polynomial  with  a single  multiple  zero  are  now  joined  by 
equations for the nearest polynomial with a complex conjugate pair of
double zeros and equations for the nearest polynomial with several
distinct double zeros. All these equations have simplified forms
because certain Lagrange multipliers vanish in the complex case. But
some examples demonstrate that when only real perturbations are considered,
the Lagrange multipliers do not always vanish. Finally, there
[several words illegible] about the [illegible] the nearest polynomial with a
double zero.

The numerical experiments show that Newton's method may be used
[...] the expected result is sufficiently [illegible]. The techniques may
also be applied to Wilkinson's famous example whose
zeros are the integers from 1 to 20. But then the numerical results
suggest that that ill conditioned polynomial can not be explained
successfully as a small perturbation of a well conditioned polynomial.
Instead Wilkinson's polynomial lies in a region of polynomial space
whose geometry seems to be exceptionally complicated.

Bounds on uncertainties in zeros corresponding to uncertainties
in coefficients are customarily computed with Taylor series. For ill
conditioned simple zeros these Taylor series have radii of convergence
that are much too small. The well conditioned multiple zeros of a
nearby polynomial are not amenable to Taylor series expansions but may
be expanded in Puiseux fractional power series. These fractional
power series, however, also have unsatisfactory regions of convergence.

By choosing a starting point the convergence problem of
the Puiseux series can be overcome to produce, in principle, series
that converge rapidly throughout the region of interest. In practice
those series are used to produce realistic bounds on the uncertainties
in the zeros. Full exploitation of these techniques awaits adequate
facilities for symbolic algebra.
CHAPTER I


INTRODUCTION  AND  MOTIVATION 
1.  What  is  the  Problem? 

The research to be reported in the following chapters deals with
"ill  condition"  of  the  zeros  of  polynomials.  "Ill  condition"  means 
unusually  great  sensitivity  of  the  zeros  to  changes  in  the  coefficients 
of  the  polynomial. 

Consider  the  following  example:  a physicist  has  determined  that 
a parameter of interest may be determined by finding the zeros of a
polynomial.  He  computes  the  coefficients  of  the  polynomial  and  solves 
for  its  zeros  with  any  of  a number  of  computer  codes  which  find  zeros 
of  polynomials.  Then  the  computer  states  that  his  polynomial  of  degree 
six  has  the  following  zeros: 

-2.0 

-1.0 

+0.99999998 ± 0.000104625 i

+2.0 

+3.0 

Perhaps  being  distrustful,  the  physicist  computes  the  coefficients  of 
the  polynomial  which  has  exactly  these  zeros.  He  finds  that  those 
reconstituted  coefficients  agree  with  the  original  coefficients  of 
the  polynomial  he  gave  the  computer  to  well  within  the  uncertainty 
in the coefficients, which were derived from experimental data. He
will usually find that the differences between those sets of coefficients
are comparable in size to a few rounding errors, so he seems
to have no grounds for complaint with the computed result.

Nonetheless there may be sound physical reasons why the answers
he seeks can not have imaginary components. Then why do they appear
in his answer? Is he justified in ignoring them? The methods proposed
in the following chapters provide a way of dealing with these
questions.

Those  methods  would  "explain"  the  physicist's  quandary  as  follows. 
First  they  would  show  that  the  two  complex  conjugate  zeros  are 
extremely  ill  conditioned.  That  is,  small  changes  in  the  coefficients 
comparable with experimental error could easily cause them to undergo
much  larger  real  or  complex  changes.  The  ill  condition  arises  from 
the  fact  that  the  physicist's  polynomial  is  very  close  to  a polynomial 
with  a double  zero.  In  fact,  the  methods  we  will  discuss  show  that 
changing each coefficient of the polynomial by as little as one part
in $10^{8}$ suffices to cause the polynomial to have a double zero at 1.0.
That double zero is well conditioned, in a sense to be explained later.
Therefore the physicist might "ameliorate" the condition of the answers
to his problem by accepting a double zero at 1.0 in place of the
complex conjugate pair if the experimental uncertainties in the coefficients
exceed one part in $10^{8}$ and there is physical justification for
assuming that his answer should be in the form of a double zero.

Where  that  justification  is  lacking,  the  ill  condition  of  the  result 
is  a warning  signal  that  a misjudgment  in  the  design  of  the  experiment 
and  computation  may  have  invalidated  the  results. 
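
The nearness of a polynomial with a double zero can be checked numerically. The following is a minimal sketch, not the method of the later chapters: it rebuilds the coefficients from the six computed zeros listed above, then adjusts only two coefficients (those of x^3 and the constant term, an arbitrary choice made here for simplicity) so that both the polynomial and its derivative vanish at 1.0. The helper names are hypothetical, and the perturbation found this way is merely one admissible perturbation, not necessarily the smallest.

```python
def poly_from_zeros(zeros):
    """Coefficients (highest power first) of the monic polynomial with the given zeros."""
    coeffs = [1.0 + 0.0j]
    for z in zeros:
        # Multiply the current polynomial by (x - z).
        new = coeffs + [0.0j]
        for i in range(len(coeffs)):
            new[i + 1] -= z * coeffs[i]
        coeffs = new
    # Conjugate pairs give real coefficients; discard rounding-level imaginary parts.
    return [c.real for c in coeffs]


def value_and_derivative(coeffs, x):
    """Horner evaluation of p(x) and p'(x)."""
    p, dp = 0.0, 0.0
    for c in coeffs:
        dp = dp * x + p
        p = p * x + c
    return p, dp


a, b = 0.99999998, 0.000104625
zeros = [-2.0, -1.0, complex(a, b), complex(a, -b), 2.0, 3.0]
p = poly_from_zeros(zeros)

# Choose changes d3 (coefficient of x^3) and d0 (constant term) so that
#   p(1) + d3 + d0 = 0   and   p'(1) + 3*d3 = 0.
p1, dp1 = value_and_derivative(p, 1.0)
d3 = -dp1 / 3.0
d0 = -p1 - d3

q = list(p)
q[3] += d3  # index 3 holds the x^3 coefficient of a degree-6 polynomial
q[6] += d0  # index 6 holds the constant term

max_rel_change = max(abs(d3 / p[3]), abs(d0 / p[6]))
print("largest relative coefficient change:", max_rel_change)
```

With these particular zeros the largest relative change comes out below one part in 10^8, so a double zero at 1.0 is indeed reachable by a perturbation of that order.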
2.  What is Ill Condition?

We  turn  now  to  precise  definitions  of  terms  like  posedness, 
condition,  and  stability.  The  terms  have  been  defined  by  numerical 
analysts in many different and sometimes inconsistent ways; our definitions
will be those used by W. Kahan in numerical analysis courses
at the University of California, Berkeley [18]. These definitions
are also close to those in the widely used text by Dahlquist and
Björck [6].

The  definitions  to  follow  make  sense  if  one  thinks  of  a problem 
having  a definite  set  of  input  data  and  a similar  set  of  output  data 
which we call the solution. For instance, in the problem of determining
the n complex zeros of an n'th degree polynomial, the n + 1
coefficients of the polynomial are the input data and the n zeros
are the solution. In contrast, the "problem" of finding a polynomial
approximating  a given  function  is  incomplete  until  we  specify  a 
criterion  for  choosing  the  best  approximation.  That  criterion  could 
be  regarded  as  fixed,  and  hence  part  of  the  problem,  or  subject  to 
change,  and  hence  part  of  the  data. 

If furthermore the data are regarded as uncertain, then the information
on the size of the uncertainty becomes part of the data. This
information is often expressed in terms of a metric or norm on the
space from which the input data are drawn. The norm itself may also
be part of the input data if it is subject to change. The purpose of
the norm on the input data, for example, is to provide a way for the
problem poser to specify which inputs are so close together as to be
indistinguishable from his point of view. In addition, there may be
a norm on the output solution with a similar purpose. As we shall
see, the poser may be obliged to provide these norms even if the input
data are regarded as exact.

Within  this  framework  a problem  is  well  posed  if  it  (1)  has  a 
solution  which  (2)  is  unique  and  (3)  varies  continuously  when  the 
input  data  vary  continuously.  Consequently  an  ill  posed  problem  may, 
for  some  input  data,  have  several  solutions  or  none  or  the  solution 
may  change  discontinuously  when  the  input  data  is  changed  continuously. 

The  answer  to  the  question  of  whether  a problem  is  well  posed  is 
either  yes  or  no. 

Given  a problem  that  is  analytically  well  posed,  we  call  it  well 
conditioned  if  changes  that  we  consider  negligible  in  the  input  data 
can only cause changes in the solution that we also consider negligible.
Conditioning can be measured by computing the partial derivatives
of the solution with respect to changes in the input data. If the
appropriate norm of these partial derivatives, called the condition
number, is too large, the problem is ill conditioned. Unlike posedness,
then, there is not a sharp break between well and ill conditioned
problems, but rather a continuum.

From  our  point  of  view,  stability  is  a property  of  algorithms, 
rather  than  problems,  and  relates  to  the  question,  "Does  this 
algorithm always produce a solution as good as can be expected, considering
the condition of the problem?" Interesting numerical
algorithms almost always fail to produce the mathematically correct
solution to a problem. This is because such algorithms usually commit
rounding errors due to finite precision arithmetic and truncation
errors due to terminating infinite analytical processes after a finite
number of steps.

A stable algorithm has the property that the uncertainty it contributes
to the solution of a problem is not much larger than the
uncertainty  that  would  be  associated  with  small  changes  in  the  input 
data.  Figures  1.1  and  1.2  illustrate  a stable  algorithm  applied  to  an 
ill conditioned problem. A stable algorithm applied to a well conditioned
problem yields nearly the correct answer. Many stable
algorithms,  moreover,  can  be  shown  to  deliver  the  exact  solution  of 
a problem  with  input  data  very  near  the  given  input  data,  even  if  that 
data  is  ill  conditioned. 

To  conclude  the  definitions,  recall  that  the  key  to  the  problem 
of  the  physicist  in  section  1 was  to  find  the  polynomial  with  a double 
zero  nearest  his  polynomial.  In  general,  the  polynomials  with  one  or 
more  multiple  zeros  form  a subset  of  the  space  of  all  polynomials. 

These  subsets  have  been  called  pejorative  manifolds  by  W.  Kahan  [17], 
because  polynomials  near  a pejorative  manifold  always  have  some  ill 
conditioned  zeros.  Since  they  are  the  only  manifolds  that  interest 
us,  we  will  use  the  term  manifold  in  subsequent  chapters  to  mean  one 
of  these  pejorative  manifolds.  Thus  the  manifold  of  n'th  degree  monic 
polynomials  with  one  m-tuple  zero  is  a surface  with  dimensionality 
n-m  + 1 in  the  space  of  all  n'th  degree  monic  polynomials. 

The  distinction  between  wrong  answers  caused  by  an  ill  conditioned 
problem  and  wrong  answers  caused  by  an  unstable  algorithm  applied  to 
a well  conditioned  problem  is  well  known  in  the  west  mostly  because 
of the work of Wilkinson [34]. But similar concepts are also present
in the contemporaneous work of the Soviet author V. Zaguskin [37].
Zaguskin defines condition numbers with respect to small finite rather
than infinitesimal perturbations. In well conditioned cases his
methods give an idea of how much the zeros of a polynomial may vary
as the polynomial varies within its finite uncertainty. In chapter VII
we will show how such notions may be applied even for an ill conditioned
polynomial. There we will show how to develop the whole series
of which the infinitesimal condition number is simply a bound on the
first term.

Figure 1.1. Effect of ill conditioning: a ball in the input
space maps into a cigar-shaped region in the
solution space.

Figure 1.2. A stable algorithm maps the input point *
into the region bounded by the dotted ball
which is not much larger than the image of the
input ball.


3.  Examples of Definitions

An  example  might  help  to  clarify  the  definitions  of  the  previous 
section.  Consider  the  problem  of  finding  the  smaller  real  zero  of  the 
quadratic  polynomial 

$$f(x) = x^2 + 2x + 1 - \epsilon \quad \text{for } |\epsilon| \le 0.1 .$$

We see that for $\epsilon = 0$ there is a real double zero; for $\epsilon < 0$
there are no real zeros; for $\epsilon > 0$ there are two distinct real
zeros. Since in some cases of the input there is no solution to this
problem, it is ill posed.

Suppose we restrict the problem so $0 < \epsilon < 0.1$. Now the problem
has become well posed but ill conditioned. Consider the dependence
of the zeros of f on $\epsilon$:

$$x_\pm = -1 \pm \sqrt{\epsilon} , \qquad \frac{dx_\pm}{d\epsilon} = \pm \frac{1}{2\sqrt{\epsilon}} .$$

So as $\epsilon \to 0$ this condition number becomes arbitrarily large in
magnitude. Any small error in the original data or in the computation
may be magnified by an arbitrarily large factor. Note how in this
case, as in many others, approaching ill posedness corresponds to
worsening condition. See Kahan [17].
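
The growth of this condition number is easy to tabulate. The sketch below assumes ordinary double precision arithmetic and uses hypothetical function names; it evaluates the smaller zero of x^2 + 2x + 1 - eps together with the magnitude of its derivative with respect to eps, 1/(2 sqrt(eps)), for a decreasing sequence of eps.

```python
import math


def smaller_zero(eps):
    """Smaller real zero of x^2 + 2x + 1 - eps, valid for eps > 0."""
    return -1.0 - math.sqrt(eps)


def condition(eps):
    """Magnitude of d(zero)/d(eps), which equals 1 / (2 * sqrt(eps))."""
    return 1.0 / (2.0 * math.sqrt(eps))


for eps in (1e-2, 1e-4, 1e-8, 1e-12):
    print(eps, smaller_zero(eps), condition(eps))
```

Each reduction of eps by a factor of 100 multiplies the condition number by 10, which is the square-root behavior the formula predicts.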

What are the pejorative manifolds in the quadratic case? There
is just one, the manifold of quadratics with double zeros. In the
space of quadratics

$$x^2 + bx + c ,$$

the manifold of polynomials with double zeros is just the subset of
polynomials with

$$b^2 = 4c .$$

It is evident that the previous polynomial

$$x^2 + 2x + 1 - \epsilon$$

lies rather near this manifold; that nearness causes the ill condition
of its zeros.

Stability may be illustrated by considering the problem of finding
the small real zero x of the polynomial

$$x^2 - 2x + \delta ,$$

for $|\delta| < 10^{-20}$. The usual formula yields

$$x = 1 - \sqrt{1 - \delta} .$$

On most computers there will be numbers $\delta$ large enough to be
representable but small enough that the computed value of $1 - \delta$ is 1.
In this case the computed $x = 0$. For many purposes this is unacceptably
far from the correct answer which is $x \approx \tfrac{1}{2}\delta$. A check of condition
numbers shows that they are small. That the fault lies with
the algorithm implementing the usual formula, rather than with the
problem, can be seen by considering another less well known but equivalent
formula for the zero:

$$x = \delta / (1 + \sqrt{1 - \delta}) .$$

An algorithm implementing this formula will compute an approximately
correct answer for small $\delta$ even in the face of rounding error.
This should come as no surprise since this polynomial is obviously far
from the pejorative manifold.
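
The two formulas can be compared directly. This is a small sketch assuming IEEE double precision (Python floats), with hypothetical function names; the first function implements the usual formula and the second the equivalent form that avoids subtracting nearly equal numbers.

```python
import math


def small_zero_naive(delta):
    # Usual quadratic formula: 1 - delta rounds to 1 when delta is tiny,
    # so the subtraction cancels completely.
    return 1.0 - math.sqrt(1.0 - delta)


def small_zero_stable(delta):
    # Algebraically equivalent form with no subtraction of nearly equal numbers.
    return delta / (1.0 + math.sqrt(1.0 - delta))


delta = 1e-20
print(small_zero_naive(delta))   # 0.0 in IEEE double precision
print(small_zero_stable(delta))  # 5e-21, close to delta / 2
```

The problem is the same in both cases; only the algorithm differs, which is why the failure of the first formula is a matter of stability rather than condition.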


4.  What Is Ill Condition of Zeros of Polynomials?

The  chapters  to  come  will  discuss  methods  for  dealing  with  ill 
conditioned  zeros  of  polynomials.  In  order  to  see  why  such  methods 
might  be  useful,  we  consider  first  the  problem  of  finding  the  zeros 
of  a polynomial  from  its  coefficients.  Several  algorithms  are  now 
known  which  are  not  only  stable  in  the  sense  outlined  above,  but  also 
are  more  efficient  than  other  (unstable)  methods.  Best  known  of  these 
is  that  of  Jenkins  and  Traub  [14];  another  good  one  is  Brian  Smith's 
version of Laguerre's method [30]. FORTRAN implementations of both
these algorithms are available in the IMSL library [13]. The stability
of these algorithms may be shown for a specific problem by computing
the coefficients of a polynomial whose zeros are exactly the
zeros computed by the algorithm. Then the coefficients of the original
polynomial  do  not  differ  much  from  the  coefficients  of  the  polynomial 
recomputed  from  the  numerical  solution. 
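
The stability check just described can be sketched in a few lines. The code below is an illustration under stated assumptions, not one of the library routines cited: the function names are hypothetical, the "computed" zeros are made-up values standing in for the output of a zero finder, and the measure used is the largest relative change in any nonzero coefficient.

```python
def poly_from_zeros(zeros):
    """Coefficients (highest power first) of the monic polynomial with the given zeros."""
    coeffs = [1.0 + 0.0j]
    for z in zeros:
        # Multiply the current polynomial by (x - z).
        new = coeffs + [0.0j]
        for i in range(len(coeffs)):
            new[i + 1] -= z * coeffs[i]
        coeffs = new
    return coeffs


def max_relative_coefficient_change(original, zeros):
    """Largest relative difference between original and reconstituted coefficients."""
    rebuilt = poly_from_zeros(zeros)
    return max(abs(r - c) / abs(c) for r, c in zip(rebuilt, original) if c != 0)


original = [1.0, -6.0, 11.0, -6.0]            # (x - 1)(x - 2)(x - 3)
computed = [1.0000000001, 1.9999999999, 3.0]  # hypothetical zero finder output
print(max_relative_coefficient_change(original, computed))
```

A small value indicates that the computed zeros are the exact zeros of a polynomial close to the original, which is all that stability promises.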

But if we happen to know the exact zeros of the original polynomial,
we may find that they differ greatly from the zeros that were
computed. If this is the case (that a stable algorithm has produced
results that are more than slightly wrong) then the problem must be
ill conditioned. In the previous section we saw that the condition
of zeros of a quadratic polynomial was related to how nearly the polynomial
came to having a double zero. It is a basic fact about the
zeros  of  analytic  functions  that  nearness  to  a function  with  a multiple 
zero  corresponds  to  ill  condition  of  the  zeros. 

As a simple example consider the analytic function

$$f(x) = (x - \alpha)^m g(x)$$

where $g(x)$ is analytic and $g(\alpha) \ne 0$. If $f(x)$ is perturbed by
$\epsilon h(x)$, $h(\alpha) \ne 0$, then the perturbed zeros $\beta$ satisfy

$$f(\beta) - \epsilon h(\beta) = 0 ,$$

so

$$\epsilon = \frac{g(\beta)}{h(\beta)} (\beta - \alpha)^m .$$

In chapter VII we will see that the last equation can be transformed
to express $\beta - \alpha$ as a power series in $\epsilon^{1/m}$. Thus there are m zeros
$\beta$ which converge to $\alpha$ as $\epsilon \to 0$.

Implicit differentiation reveals the dependence of a solution $\beta$
on the data $\epsilon$:

$$\frac{d\beta}{d\epsilon} = \frac{1}{\epsilon} \cdot \frac{1}{\dfrac{m}{\beta - \alpha} + \dfrac{g'(\beta)}{g(\beta)} - \dfrac{h'(\beta)}{h(\beta)}} .$$

As $\epsilon \to 0$, $\beta \to \alpha$, $g(\beta) \to g(\alpha)$, and $h(\beta) \to h(\alpha)$. Simultaneously
the condition number $|d\beta/d\epsilon|$ increases like $1/|\epsilon|^{1 - 1/m}$ without
bound, so the condition of each $\beta$ becomes infinitely bad.
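
The simplest instance of this example makes the fractional power visible. The sketch below assumes the special case g = h = 1, so that the perturbed function is (x - alpha)^m - eps and its zeros are alpha plus eps^(1/m) times the m-th roots of unity; the function names are hypothetical.

```python
import cmath


def perturbed_zeros(alpha, m, eps):
    """Zeros of (x - alpha)**m - eps for eps > 0."""
    radius = eps ** (1.0 / m)
    return [alpha + radius * cmath.exp(2j * cmath.pi * k / m) for k in range(m)]


def condition(m, eps):
    """|d(beta)/d(eps)| = eps**(1/m - 1) / m in this special case."""
    return eps ** (1.0 / m - 1.0) / m


eps = 1e-12
for m in (1, 2, 3, 4):
    shift = abs(perturbed_zeros(1.0, m, eps)[0] - 1.0)
    print(m, shift, condition(m, eps))
```

For a fixed perturbation of 1e-12 the zeros move by roughly 1e-12, 1e-6, 1e-4 and 1e-3 as m goes from 1 to 4, so higher multiplicity means far greater sensitivity to perturbations that break the multiplicity.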

One  way  to  visualize  the  meaning  of  the  condition  number  is  to 
think  of  the  process  of  finding  a zero  of  a polynomial  as  a mapping 
from  the  space  of  polynomials  into  the  complex  plane.  Then  we  can 
ask how an infinitesimal neighborhood in polynomial space is mapped
into the complex plane. If that neighborhood is spherical then its
image will usually look elliptical. In a well conditioned case the
ellipse is small; in an ill conditioned case large. In the case of an
infinitesimal  neighborhood  of  a polynomial  with  a multiple  zero,  the 
image  is  a large  star-shaped  region. 

The research to be described is motivated by the desire to know
how large these image regions may become for polynomials within a finite
ball. The condition number tells how large the ellipses may be in the
infinitesimal case; it can be used to bound the first term of a power
series. Just when that first term is large, however, the power series
turns out to have a short radius of convergence. In fact, if a manifold
of polynomials with multiple zeros runs through the ball, then
the usual power series can not converge at every point in the ball.

But by exploiting that manifold as described in chapter VII we
may be able to get, in principle, a different kind of series that converges
throughout the ball. The notion underlying that series may be
used, in practice, to obtain a bound on the size of the image of the
ball.

If  the  polynomial  from  which  we  expand  lies  on  a manifold,  the 
nature  of  series  expansions  of  its  multiple  zeros  is  different  than 
when the polynomial lies off the manifold. The series includes fractional
powers of the perturbations. This is not a severe handicap.
However it may be that there are a priori reasons for knowing that
the only significant perturbations are those which are along the manifold
and maintain multiplicities. Then reasonable condition numbers
can  be  defined  which  are  finite  with  respect  to  those  perturbations. 
Furthermore  the  expansions  used  to  bound  the  changes  in  the  zeros  take 
much  simpler  forms. 
5.  Treating the Symptoms of Ill Condition

Large  condition  numbers  are  a warning  that  small  changes  in  the 
input  data  cause  large  changes  in  the  solution  of  a problem.  In  the 
next  section  we  consider  ways  of  identifying  the  underlying  difficulty, 
but  now  we  will  merely  treat  the  symptoms:  substantial  changes  in 
our  answers  are  being  caused  by  seemingly  insignificant  changes  in  our 
data or by rounding and truncation errors in our algorithms.

If  our  data  is  derived  experimentally,  we  could  try  to  perform 
more  careful  experiments  in  order  to  get  the  variation  in  our  answers 
within  acceptable  limits.  If  the  data  is  not  subject  to  empirical 
uncertainties, then the errors in our algorithms are the cause of our
symptoms. We may use increased precision to reduce the effect of
rounding errors, and we may carry out more steps of infinite processes
to reduce truncation errors. For polynomials, this would mean carrying
out more steps of iterative processes such as Newton's method.

If  the  coefficients  of  a polynomial  are  known  exactly,  then 
rational  arithmetic  may  be  used  to  determine  the  zeros  to  any  required 
accuracy. Pinkert [41] discusses such a method. These methods are
relatively slow on present computers, but they do eliminate ill condition
as a factor affecting accuracy of computed zeros. Exact arithmetic
methods are inappropriate, however, when the coefficients are
not  precisely  known;  then  explicit  account  should  be  taken  of  ill 
condition. 

Changing the algorithm does not change the condition of the problem,
but an unstable algorithm can aggravate our symptoms of ill
condition. Sometimes we can reformulate the problem to take advantage
of a stable algorithm. In other cases we can reformulate the problem
to make it better conditioned.

Thus we will see later that the condition of a zero of a polynomial
may sometimes be improved by translating the polynomial so that
the zero to be found is near the origin. In certain cases this may
be helpful, but care must be taken that the translation is computed
with insignificant rounding error. The translation of the coefficients
is computed effectively by evaluating the polynomial and n of
its derivatives. Usually such translations must be performed in
higher precision when ill conditioned zeros are involved. Stewart
[31] shows that the effect of such translations, carried out in conventional
fashion, is comparable to the effect of rounding errors in
the coefficients of the original polynomial. Kahan [18] has shown
that unconventional algorithms can sometimes do better than would be
expected from [31], but his algorithm is a fluke.
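
A conventional translation can be sketched with repeated synthetic division, which amounts to evaluating the polynomial and its scaled derivatives at the shift point. This is a minimal illustration in working precision with a hypothetical function name; as the text warns, ill conditioned zeros usually call for higher precision than is used here.

```python
def translate(coeffs, s):
    """Coefficients (highest power first) of q(y) = p(y + s).

    Each pass of the inner loop is one synthetic division by (x - s);
    after pass i, c[n - i] holds the y**i coefficient of q, that is,
    the i-th derivative of p at s divided by i factorial.
    """
    c = list(coeffs)
    n = len(c) - 1
    for i in range(n):
        for j in range(1, n - i + 1):
            c[j] += s * c[j - 1]
    return c


# p(x) = x^2 - 2x + 1 has a double zero at 1; shifting by 1 moves it to the origin.
print(translate([1.0, -2.0, 1.0], 1.0))
```

After the shift the zero of interest sits at the origin, and its new coefficients show directly how small the low-order terms are.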

If  one  is  concerned  with  numerical  treatment  of  a polynomial  that 
arises experimentally, it may be that careful translation is the most
reasonable method of "ameliorating" ill condition that has no obvious
source. Such translation is justified if the zeros represent a physical
quantity whose origin is arbitrary. The coordinates of a point
on  a line,  for  instance,  are  sometimes  arbitrary,  but  not  if  something 
interesting,  such  as  a body  exerting  a central  force,  occurs  at  the 
origin. 

However  performed,  translation  amounts  to  attacking  the  problem 
of  ill  condition  piecemeal,  one  zero  at  a time,  rather  than  trying  to 
deal  with  the  overall  condition  of  the  problem.  And  the  results  of 
translation  in  no  way  "explain"  the  ill  condition. 
6.  Explaining Ill Condition

The  methods  to  be  presented  later  try  to  "explain"  ill  condition 
by finding the nearest polynomial with all its zeros well conditioned.
That polynomial will be on one of the pejorative manifolds of polynomials
with multiple zeros. At the end of chapter II we will see that
if an m-tuple zero is sufficiently ill conditioned there must be a
polynomial with an (m+1)-tuple zero fairly close by. So we may in
succession  try  to  find  the  nearest  polynomial  with  a double  zero,  a 
triple  zero,  two  double  zeros,  and  so  on.  We  may  count  ourselves 
successful  if  we  find  that  one  of  these  nearest  polynomials  has  all 
of  its  zeros  well  conditioned  and  yet  is  close  enough  to  our  original 
polynomial.  When  we  are  successful,  our  starting  polynomial  may  be 
explained  as  a small  perturbation  of  a polynomial  with  some  multiple 
zeros,  all  of  which  are  well  conditioned. 

The  reader  with  some  experience  may  feel  that  the  nearest  such 
polynomial  should  be  apparent  from  inspection  of  the  distribution  of 
zeros,  for  ill  conditioned  zeros  often  form  obvious  clusters.  After 
all,  an  m-tuple  zero  subjected  to  a suitably  small  perturbation  will 
usually  split  up  into  m distinct  zeros,  and  such  configurations 
should  be  easily  recognized.  However,  the  ill  conditioned  simple 
zeros  scatter  so  quickly  that  they  may  soon  lose  their  clustered 
aspect.  As  we  shall  see  later  when  we  discuss  Wilkinson's  polynomial, 
it  is  sometimes  impossible  to  guess  just  by  inspection  of  the  zeros 
what  the  nearest  polynomial  with  well  conditioned  zeros  might  be  like. 

We  may  find,  moreover,  that  no  small  perturbation  will  get  us  to 
a polynomial  with  all  zeros  well  conditioned.  Rather,  by  moving 
increasing distances we may increasingly improve the condition of the
zeros, but in order to improve the condition of all zeros as much as
we want it is necessary to move much further than we want. Wilkinson's
polynomial seems to be of this sort; it is discussed in chapter X.

There is no natural division between the polynomials which are
explainable and those which are not; however we set a somewhat arbitrary
boundary by our choice of norm and tolerance.

If we do find a nearby polynomial with all of its zeros well conditioned
with respect to variations that maintain multiplicities, then
we might say that moving to the new polynomial has ameliorated the
problem  of  ill  condition.  Such  a viewpoint  makes  sense  only  if  the 
new  polynomial  is  indistinguishable  from  the  original  and  it  is 
reasonable to hypothesize that the original problem could have a built-in
constraint in favor of multiple zeros. This constraint may have
existed unrecognized heretofore, or perhaps there was no convenient
algorithmic way to provide for it when finding the zeros of the polynomial
from the coefficients. Such a constraint may reveal itself in
the following way: an experimental system has the property that the
observed parameters always seem to be well conditioned functions of
the  controllable  parameters.  The  mathematical  model  for  the  system, 
however,  might  lack  that  well  conditioned  relation  of  output  to  input. 
Should  we  add  something  to  the  model?  We  could  add  a constraint  in 
favor  of  some  multiplicity  structure,  e.g.  one  double  zero,  that  is 
inspired by a feature of the physical system. For instance a symmetry
in the experimental system might correspond to a double zero in the
polynomial.

Constraints  upon  the  form  of  the  solution  should  not  be  imposed 
merely  to  obtain  a well  conditioned  solution.  Not  all  experimental 
systems are well conditioned, and not all problems should have well
conditioned solutions. Suppressing annoying numerical properties may
be  equivalent  to  ignoring  the  most  important  and  interesting  features 
of  the  system.  It  may  be  that  the  observed  ill  condition  corresponds 
to  an  important  feature  of  the  problem  that  is  not  properly  reflected 
in  our  theory.  In  other  cases  ill  condition  may  mean  that  the  problem 
we  seek  to  solve  is  so  close  to  being  ill  posed  that  it  is  senseless 
to  try  to  solve  the  problem  in  the  presence  of  error. 

Example.  Figure  1.3  is  an  example  of  a physical  system.  It  is 
the  well  known  damped  harmonic  oscillator  discussed  in  elementary 
physics courses; see, e.g., Kibble [20]. A mass m may travel up and
down.  It  is  attached  through  a spring  to  the  roof;  the  other  end  is 
attached  to  a shock  absorber  (dashpot).  If  the  mass  is  moved  from 
its  rest  position  and  released  it  will  eventually  return  to  its  rest 
position,  because  of  friction  forces  in  the  dashpot.  The  goal  of  an 
engineer  might  be  to  design  the  dashpot  so  that  the  mass  will  return 
to  its  rest  position  as  quickly  as  possible  after  a perturbation.  By 
adjusting  the  dashpot,  the  mass  may  be  caused  to  return  to  its  rest 
position  as  rapidly  as  possible  without  oscillation.  The  system  is 
then  said  to  be  critically  damped.  The  engineer  may  decide  that  the 
spring force on m is $-kx$ for a $k > 0$ which can be measured to
perhaps three significant figures. An investigation of the friction
forces of the fluid in the dashpot might confirm that the friction
forces on m can be approximated by $-d\dot{x}$ for a constant $d > 0$,
which can again be measured to a few figures. Finally the mass itself
can  be  measured. 




Then the mathematical model corresponding to the stated physical
assumptions is that the restoring force on m is $-kx - d\dot{x}$ so

$$m\ddot{x} + d\dot{x} + kx = 0,$$

and $x(0) = x_0$ and $\dot{x}(0) = v_0$ are the initial conditions. The
solutions to such linear ordinary differential equations with constant
coefficients are usually linear combinations of exponentials $e^{c_+ t}$
and $e^{c_- t}$, where $c_+$ and $c_-$ are the zeros $c$ of the quadratic
polynomial

$$mc^2 + dc + k.$$

If $c_+ = c_-$ then the solutions are linear combinations of $e^{c_+ t}$
and $te^{c_+ t}$. The quantity to be minimized is the maximum time
constant for the components of the solution. The time constant for
$e^{ct}$ is defined to be $-1/\mathrm{Re}\,c$, which corresponds to the
non-oscillatory, decaying part of the motion of m. (The oscillatory part
is governed by $\mathrm{Im}\,c$.)


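$$-\frac{1}{\max(\mathrm{Re}\,c_+,\ \mathrm{Re}\,c_-)} =
\begin{cases}
\dfrac{2m}{d - \sqrt{d^2 - 4mk}} & \text{for } d > \sqrt{4mk}, \\[1ex]
\dfrac{2m}{d} & \text{for } 0 \le d \le \sqrt{4mk}.
\end{cases}$$

For $d \ge 0$ this is minimized by letting $d^2 = 4mk$. In that case

$$c_+ = c_-.$$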


Given  m and  k the  engineer  can  compute  an  optimal  d which 
he  can  obtain  approximately  by  adjusting  the  dashpot. 
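
The optimization just described can be tried numerically. The following is a minimal sketch, not part of the original study: the function names and the sample values m = k = 1 are assumptions chosen only for illustration. It evaluates the maximum time constant for an underdamped, a critically damped, and an overdamped setting, and then shows how far apart the two zeros move when d is changed by one part in 10^8 at critical damping.

```python
import cmath
import math

def oscillator_zeros(m, d, k):
    """Zeros c+ and c- of the quadratic m*c**2 + d*c + k."""
    root = cmath.sqrt(d * d - 4.0 * m * k)
    return (-d + root) / (2.0 * m), (-d - root) / (2.0 * m)

def max_time_constant(m, d, k):
    """Largest time constant -1/Re(c) over the two zeros."""
    c_plus, c_minus = oscillator_zeros(m, d, k)
    return -1.0 / max(c_plus.real, c_minus.real)

# Illustrative values only (assumed, not taken from the text).
m, k = 1.0, 1.0
d_crit = math.sqrt(4.0 * m * k)      # critical damping: d**2 == 4*m*k

for d in (0.5 * d_crit, d_crit, 2.0 * d_crit):
    print("d =", d, " max time constant =", max_time_constant(m, d, k))

# A relative change of 1e-8 in d splits the double zero by a much larger amount.
c_plus, c_minus = oscillator_zeros(m, d_crit * (1.0 + 1e-8), k)
gap = abs(c_plus - c_minus)
print("separation of zeros after perturbing d:", gap)
```

The separation grows like the square root of the change in d, which is the hypersensitivity of a double zero that the rest of the chapter is concerned with.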

The engineer may then mass produce these assemblies. Of course
there will be variations within tolerances in m, k, and d. Some of
the assemblies will probably exhibit oscillatory motions when perturbed.
Then the question will arise: are these variations from unit to unit
due to the normal variation of components within tolerances, or is
there an error in the design, or in the claimed tolerances?

We can resolve this question by asking: given the polynomial
corresponding to one of the production units,

$$p(c) = c^2 + \left(\frac{d}{m}\right)c + \left(\frac{k}{m}\right),$$

is the nearest polynomial with a double zero within the distance
allowed by the tolerances on $(d/m)$ and $(k/m)$? If $\Delta_d$ is the
tolerance on $(d/m)$ and $\Delta_k$ the tolerance on $(k/m)$ then we
might measure perturbations

$$q(c) = \alpha c + \beta$$

by

$$\|q\|^2 = \left(\frac{\alpha}{\Delta_d}\right)^2 + \left(\frac{\beta}{\Delta_k}\right)^2.$$

Then if the distance to the nearest polynomial with a double zero were
less than 1 in this norm, the components would likely be within
tolerance.

Suppose we have adjusted the assembly to be critically damped.
Then we may carefully measure m, k, and d. If we wanted to compute
the time constant from the data and the model, we would be wise to
incorporate a constraint in favor of double zeros in our polynomial
solver, for that constraint corresponds to a fact we know about the
physical system.

In contrast, if we carefully measured m, k, and d on an
(unadjusted) assembly from the production line, and we wished to
compute the time constant, it would be folly to incorporate a
constraint for a double zero in the polynomial solver. If we did we
would always think that the assembly was critically damped.

Even when the assembly is at or near critical damping, where
small changes in m, d, or k produce large changes in $c_+$ or $c_-$,
such small changes produce only small changes in the solution of the
differential equation, measured in an appropriate norm. That is, an
important feature of the physical system is well conditioned. We
encounter ill conditioning numerically because we choose to think of
the solution of the equation as a sum of exponentials. As a consequence
of this point of view we then solve a polynomial equation to
find  the  time  constants  of  the  exponentials.  Solving  the  polynomial 
equation  is  the  step  that  may  be  ill  conditioned. 
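
The contrast between the ill conditioned zeros and the well conditioned solution can be illustrated with a short sketch. Everything specific here is an assumption made for illustration: the values m = k = 1, the initial conditions x(0) = 1 and x'(0) = 0, the time grid, and the use of the maximum difference on that grid as the "appropriate norm".

```python
import cmath
import math

def solution(m, d, k, x0, v0, t):
    """x(t) for m*x'' + d*x' + k*x = 0 with x(0) = x0 and x'(0) = v0."""
    root = cmath.sqrt(d * d - 4.0 * m * k)
    c_plus = (-d + root) / (2.0 * m)
    c_minus = (-d - root) / (2.0 * m)
    if c_plus == c_minus:
        # Double zero: combination of exp(c*t) and t*exp(c*t).
        x = (x0 + (v0 - c_plus * x0) * t) * cmath.exp(c_plus * t)
    else:
        # Distinct zeros: combination of two exponentials.
        a = (v0 - c_minus * x0) / (c_plus - c_minus)
        b = (c_plus * x0 - v0) / (c_plus - c_minus)
        x = a * cmath.exp(c_plus * t) + b * cmath.exp(c_minus * t)
    return x.real

m, k, x0, v0 = 1.0, 1.0, 1.0, 0.0          # assumed sample data
d_crit = math.sqrt(4.0 * m * k)
d_near = d_crit * (1.0 + 1e-6)             # slightly overdamped unit
times = [0.1 * i for i in range(101)]      # 0 <= t <= 10

max_diff = max(
    abs(solution(m, d_near, k, x0, v0, t) - solution(m, d_crit, k, x0, v0, t))
    for t in times
)
# Each zero moves by about half the separation of the perturbed pair.
zero_shift = abs(cmath.sqrt(d_near * d_near - 4.0 * m * k)) / (2.0 * m)

print("shift of each zero:      ", zero_shift)
print("max change in solution:  ", max_diff)
```

With these data the zeros move by roughly a thousand times the change in d, while the solution itself changes by an amount comparable to the change in d.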

Similar mechanical problems are used as examples in the text of
Carnahan, Luther, and Wilkes [4, exercises 4.23-4.26 and example 3.1].
There the natural circular vibrational frequencies of mechanical
systems with several components are computed. These frequencies are
obtained from eigenvalues of symmetric matrices. Multiple eigenvalues
merely mean that two different modes of circular vibration happen to
have the same frequency because of chance or some physical symmetry.
Viewed as an eigenvalue problem, eigenvalues of symmetric matrices are
always well conditioned [5]. An inappropriate reformulation of an
eigenvalue problem as a polynomial problem is responsible for the ill
conditioned zeros Carnahan et al. obtain in some of the numerical
results given in their example 3.1.


7. What Do We Do With the Explanation?

Once  the  nearest  polynomial  has  been  found  which  "explains"  some 
ill  conditioned  problem,  what  should  be  done  next? 

If  we  just  substitute  the  zeros  of  the  ameliorated  or  regularized 
polynomial  for  the  zeros  of  the  original  polynomial,  we  may  be  guilty 
of covering up important features of the problem.

One  way  to  investigate  those  features  is  to  answer  the  following 
question: How do the zeros of the polynomial vary when the coefficients
of the polynomial vary within their respective uncertainties?
When  all  zeros  are  well  conditioned  this  question  is  easily  answered 
by  expressing  changes  in  the  zeros  as  a Taylor  series  in  changes  in 
the  polynomial,  of  which  only  the  first  term  or  two  are  needed  because 
the  series  converges  quickly. 

In  the  interesting  case,  however,  we  find  that  a conventional 
Taylor  series  approach  will  not  work  for  ill  conditioned  zeros.  The 
radius  of  convergence  of  the  series  never  exceeds  the  distance  to  the 
nearest  polynomial  with  a multiple  zero.  If  we  actually  move  to  that 
nearest  polynomial,  we  then  find  that  conventional  fractional  power 
series  expansion  methods  still  tend  to  founder  because  of  short  radii 
of  convergence. 

In chapter VII these problems are discussed and a method is proposed
for obtaining expansions for changes in zeros that converge in
a much larger region than conventional techniques. The proposed
method depends on using the nearest well conditioned polynomial as a
starting point for an expansion in two phases. The first phase retains
the multiplicity structure of the starting point while the second
phase continues in a conventional manner. Thus the symbolic
determination of a series expansion depends on numerical means for
determining the most suitable starting point. Most of the difficulty
of the problem is in the numerical part. Analytical difficulties
preclude getting the actual expansions, but the idea may be used in a
very practical way to get bounds for the changes in the zeros as the
coefficients vary throughout the entire region of interest. Smith [42]
explains how Gerschgorin circles may also be exploited to obtain
similar bounds.

8. Survey of Previous Results

Prior  to  the  computer  era  relatively  little  attention  was  devoted 
to  the  problem  of  ill  conditioned  simple  zeros  beyond  recognizing  that 
small perturbations tended to break up multiple zeros into ill conditioned
simple zeros. Thus the multiple zeros themselves were usually
unfairly  considered  to  be  ill  conditioned.  The  behavior  of  multiple 
zeros  under  perturbation  has  long  been  a matter  of  interest  to  analysts 
and  algebraists;  the  fractional  power  series  discussed  in  chapter  VII 
have  been  known  since  the  eighteenth  century. 

Another  facet  of  multiple  zeros  is  their  effect  on  convergence  of 
zero  finding  algorithms.  It  has  long  been  known,  for  instance,  that 
the  convergence  of  Newton's  method  is  only  linear  in  the  vicinity  of  a 
multiple zero. Consequently much effort has been expended in developing
zero finding iterations that perform better near multiple zeros.
Such  methods  have  been  discussed  by  Traub  [33]  and  Ostrowski  [25], 
among  others;  Stewart's  is  a recent  example  of  such  work  [32]. 
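
The linear convergence of Newton's method near a multiple zero is easy to observe. The sketch below is only an illustration; the two test polynomials and the starting point are assumptions chosen so that both have a zero at 1, simple in one case and double in the other.

```python
def newton_errors(f, df, x, zero, steps):
    """Run Newton's method and record |x - zero| after each step."""
    errors = []
    for _ in range(steps):
        x = x - f(x) / df(x)
        errors.append(abs(x - zero))
    return errors

# Simple zero at 1 of x**2 - 1: the error is roughly squared at each step.
simple = newton_errors(lambda x: x * x - 1.0, lambda x: 2.0 * x, 2.0, 1.0, 6)

# Double zero at 1 of (x - 1)**2: the error is only halved at each step.
double = newton_errors(lambda x: (x - 1.0) ** 2, lambda x: 2.0 * (x - 1.0), 2.0, 1.0, 6)

print("simple zero errors:", simple)
print("double zero errors:", double)
```

For the double zero each step reduces the error by the factor 1 - 1/m = 1/2, which is the behavior the modified iterations cited above are designed to improve.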

James Daniel [7] has recently studied the problem of improving
approximations to multiple zeros. He suggests that averages of clustered
ill conditioned simple zeros may be taken to determine the
multiple zero of which they are apparently approximations. The examples
he cites show that his suggestion may sometimes be helpful for
double zeros and perhaps for higher multiplicities if accuracy
requirements are not very stringent. Daniel's work has not been incorporated
in any widely available codes for polynomial zeros. The reason may be
that a conventional zero-finding code with deflation would, in the
vicinity of an m-tuple zero, find first an ill conditioned member of
an m-member cluster. Then it would find an ill conditioned member of
an (m-1)-member cluster caused by perturbing an (m-1)-tuple zero which is
not the same as the m-tuple zero of the original problem. Then the m
ill conditioned zeros that are averaged together at the end are not
all perturbations of the same multiple zero and consequently this
average does not make a very good estimate of any multiple zero.

To J. Wilkinson [34] must go credit for publicizing the fact that
ill condition and apparent clustering are not equivalent characteristics
of zeros of polynomials. This fact does not seem to have been explicitly
recognized prior to Wilkinson's work. The polynomials he chose as
examples are still being studied profitably, as in chapter X of the
present work.

Wilkinson also brought to the attention of many readers the fact
that condition could not only be rigorously defined but could be
measured as well.

In 1975 Dunaway [8] proposed a different method for dealing with
polynomials with multiple zeros. Her work is based on the fact that
the greatest common divisor (GCD) of such a polynomial and its
derivative is a polynomial whose zeros are the multiple zeros of the
original polynomial, but of multiplicity one less. GCD algorithms
have long been used for studying polynomials whose coefficients are
exactly known. Recent work by Collins [5] and others has been in the
context  of  symbolic  algebra  systems  employing  exact  rational  arithmetic. 
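
The GCD property can be seen on a small case with exactly known coefficients. The sketch below is an illustration in exact rational arithmetic using Python's `fractions` module; it is not Dunaway's floating point procedure, and the helper names and the sample cubic are assumptions. Coefficient lists run from the highest power down.

```python
from fractions import Fraction

def poly_rem(a, b):
    """Remainder of a divided by b (coefficients from highest power down)."""
    a = list(a)
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)                      # leading coefficient is now exactly zero
    while a and a[0] == 0:            # strip any remaining leading zeros
        a.pop(0)
    return a

def poly_gcd(a, b):
    """Monic GCD by Euclid's algorithm in exact rational arithmetic."""
    while b:
        a, b = b, poly_rem(a, b)
    return [c / a[0] for c in a]

def derivative(p):
    """Coefficients of the derivative of p."""
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

# (x - 1)^2 (x - 2) = x^3 - 4x^2 + 5x - 2 has a double zero at 1.
p = [Fraction(c) for c in (1, -4, 5, -2)]
g = poly_gcd(p, derivative(p))
print(g)    # coefficients of x - 1: the double zero appears with multiplicity one
```

In exact arithmetic the remainder sequence terminates with an exactly zero remainder; deciding when a computed remainder should count as zero is the difficulty that arises in finite precision.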

Dunaway's idea was to implement a traditional GCD algorithm in
standard finite precision floating point arithmetic. There the key
problem is determining when a term in a polynomial remainder sequence
may be considered to vanish, indicating that an approximate GCD has
been found. As Dunaway remarks, that is a difficult problem in finite
precision arithmetic. She does not give details as to how she resolved
it, and it is not clear that her procedure could be automated. If
that were possible, it might be an attractive method for investigating
the multiplicity structure of the zeros of polynomials without
specifying that structure in advance. In contrast, the methods to be
presented in subsequent chapters require that one specific structure be
investigated at a time: one double zero, a triple zero, two double
zeros, etc.

The present investigation is based on the work of W. Kahan
described in [17]. Kahan displayed the connection between ill condition
and nearness to the manifold of polynomials with multiple zeros.
In [17] and also in [19] he determined how to compute condition numbers
and how to derive the equations to be solved for the nearest polynomial
with a double or triple zero. He also perceived that the manifolds
could be exploited to provide a better way to express perturbed zeros
as an expansion in terms of the perturbation.

Kahan went as far as theory unaided by extensive computational
experience could be expected to go; this dissertation supplies some of
that computational experience and some of the theoretical extensions
motivated by that experience.

9. Summary of Findings

The  principal  original  results  of  this  research  are: 

1)  A new  method  for  computing  condition  numbers  for  zeros  of 
polynomials,  valid  for  certain  norms  only,  is  presented  in  chapter  II. 

2)  The  equations  to  be  solved  for  the  nearest  polynomial  with 
two  complex  conjugate  double  zeros,  two  double  zeros,  and  three  or 
more  double  zeros  are  presented  in  chapters  IV  and  V. 

3) When k complex multiple zeros are sought, the equations
that need to be solved are less complicated than might have been
thought at first. It is shown that k Lagrange multipliers may be
assumed to vanish for any interesting solutions. This result,
previously known [19] for the case of a single multiple zero, has been
extended to the case of several multiple zeros and the case of a
complex conjugate pair of multiple zeros in chapters IV and V. But a
counterexample has been discovered which indicates that, in the most
common case of a real polynomial subject only to real perturbations,
these results are not always applicable.

4) Some results on the location of the nearest polynomial with
a double zero are given in chapter VI.

5) The details of a new technique for bounding changes in the
zeros of a polynomial are presented in chapter VII. This technique,
originally suggested by W. Kahan, exploits nearby manifolds of
polynomials with multiple zeros whereas conventional techniques are
usually hindered by the presence of those same manifolds.

6) Extensive computer codes of methods presented in earlier
chapters were prepared to test the theory experimentally. In chapter
IX examples are given of successful application of these codes.

7) Extensive computer results are given in chapter X to support
the conclusion that one polynomial mentioned by Wilkinson [34] is
intrinsically not amenable to treatment of the type proposed in the
previous section, due to its position near a particularly complicated
part of the manifold of polynomials with double zeros.

10. Notation

In the following chapters we will consider perturbations of monic
algebraic polynomials p, of degree n, with real or complex coefficients:

$$p(\tau) = \tau^n + \sum_{j=1}^{n} p_j \tau^{n-j}.$$

We will usually follow the conventions of using lower case Greek
letters for scalars, lower case Roman letters other than i through
n for vectors and polynomials, and capital Roman letters for matrices,
non-linear operators on vectors, and sometimes for functions. But $p_j$
and $A_{ij}$ will usually represent scalar elements of p and A. $R^n$
and $C^n$ represent the real and complex vector spaces of dimension n.

The perturbations will be polynomials of degree at most n-1,
not usually monic:

$$q(\tau) = \sum_{j=1}^{n} q_j \tau^{n-j}.$$

We identify the space of perturbations q of a polynomial p with a
vector space of dimension n and, in the obvious basis

$$\{\tau^{n-1}, \tau^{n-2}, \ldots, 1\},$$

the elements of the vectors are the coefficients of the polynomials:

$$q = \begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{pmatrix}, \qquad
q(\tau) = \sum_{j=1}^{n} q_j \tau^{n-j}.$$

Any norm for $R^n$ or $C^n$ may now be imposed. We will be interested
in a weighted $\ell_2$ norm on $C^n$ defined by

$$\|q\|_W = (q^* W q)^{1/2} = \|W^{1/2} q\|_2,$$

where $q^*$ denotes conjugate transpose and W is Hermitian positive
definite and usually diagonal as well. In the diagonal case we write

$$\|q\|_W = \Bigl(\sum_{j=1}^{n} W_{jj} |q_j|^2\Bigr)^{1/2}.$$

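There is a dual space of functionals $u^*$ which has the usual norm

$$\|u^*\| = \sup_{q \ne 0} \frac{|u^* q|}{\|q\|_W} = (u^* W^{-1} u)^{1/2},$$

or

$$\|u^*\| = \Bigl(\sum_{j=1}^{n} |u_j|^2 / W_{jj}\Bigr)^{1/2}$$

in the diagonal case. Most often the functional we are interested in
is $e_\zeta^*$, the functional that evaluates a polynomial q at $\zeta$:
$e_\zeta^* q = q(\zeta)$. In our basis
$e_\zeta^* = (\zeta^{n-1}, \zeta^{n-2}, \ldots, \zeta, 1)$.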

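One frequently used operator is the derivative operator D which
maps $C^n$ to $C^n$ and has the matrix form

$$D = \begin{pmatrix}
0 & & & & \\
n-1 & 0 & & & \\
 & n-2 & 0 & & \\
 & & \ddots & \ddots & \\
 & & & 1 & 0
\end{pmatrix}.$$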

We can for instance write $e_\zeta^* D^k$ for the functional which
evaluates the k'th derivative of a polynomial at $\zeta$. In fact we
will often be interested in the operator which computes a polynomial
and its first m-1 derivatives at $\zeta$. We will define it as

$$A_\zeta = \begin{pmatrix} e_\zeta^* \\ e_\zeta^* D \\ \vdots \\ e_\zeta^* D^{m-1} \end{pmatrix}$$

so

$$A_\zeta q = \begin{pmatrix} q(\zeta) \\ q'(\zeta) \\ \vdots \\ q^{(m-1)}(\zeta) \end{pmatrix}.$$


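Corresponding operators $\tilde{D}$ and $\tilde{A}$ can be defined for
polynomials of degree n; their matrices operate on vectors of
dimension n + 1. Then

$$\tilde{A}_\zeta p = \begin{pmatrix} p(\zeta) \\ p'(\zeta) \\ \vdots \\ p^{(m-1)}(\zeta) \end{pmatrix};$$

$\tilde{A}_\zeta$ is m by (n+1).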

It is handy to note here that the m rows of $A_\zeta$ are independent
for m < n. For if we apply $A_\zeta$ to the vector q representing
$(\tau-\zeta)^k$ we find

$$A_\zeta q = (0, \ldots, 0, k!, 0, \ldots, 0)^T,$$

with $k!$ in position k+1. By letting k run from 0 to m-1 we find that
the rank of $A_\zeta$ is indeed m.
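
The notation above can be checked on a small case. The following sketch is an illustration only; the function names and the sample values n = 5, m = 3, and zeta = 2 are assumptions. It builds the evaluation functional, the derivative matrix D, and the matrix whose rows evaluate a polynomial and its first m-1 derivatives, then applies that matrix to the vectors representing powers of (tau - zeta).

```python
from math import comb, factorial

def eval_functional(zeta, n):
    """Row vector (zeta^(n-1), ..., zeta, 1) that evaluates q at zeta."""
    return [zeta ** (n - 1 - j) for j in range(n)]

def derivative_matrix(n):
    """n-by-n matrix D acting on coefficient vectors (q_1, ..., q_n)."""
    D = [[0] * n for _ in range(n)]
    for i in range(1, n):
        D[i][i - 1] = n - i      # subdiagonal entries n-1, n-2, ..., 1
    return D

def row_times_matrix(u, M):
    """Product of a row vector u with a matrix M."""
    return [sum(u[i] * M[i][j] for i in range(len(u))) for j in range(len(M[0]))]

def A_matrix(zeta, n, m):
    """Rows e*, e*D, ..., e*D^(m-1) for the given zeta."""
    rows, row = [], eval_functional(zeta, n)
    D = derivative_matrix(n)
    for _ in range(m):
        rows.append(row)
        row = row_times_matrix(row, D)
    return rows

def shifted_power(zeta, k, n):
    """Length-n coefficient vector of (tau - zeta)^k, for k <= n-1."""
    coeffs = [comb(k, i) * (-zeta) ** i for i in range(k + 1)]
    return [0] * (n - 1 - k) + coeffs

n, m, zeta = 5, 3, 2                     # assumed sample values
A = A_matrix(zeta, n, m)
values = [
    [sum(a * b for a, b in zip(row, shifted_power(zeta, k, n))) for row in A]
    for k in range(m)
]
for k, v in enumerate(values):
    print("k =", k, " A q =", v)         # k! appears in position k+1
```

Because the m images are multiples of distinct unit vectors, the rows of the matrix must be independent, which is the rank argument given in the text.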


Frequently we will be using $\zeta$ as a symbol for a multiple zero
of a nearby polynomial and $\alpha$ will be a symbol for a zero of the
original polynomial. We will write $e^*$ for $e_\zeta^*$ and A for $A_\zeta$.
In chapters II and VII, however, A will be an m-1 by n matrix $A_\alpha$.


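Those chapters also use the n by n-m+1 matrix

$$P_{m-1} = \begin{pmatrix}
1 & & & \\
(m-1)(-\alpha) & 1 & & \\
\vdots & (m-1)(-\alpha) & \ddots & \\
(-\alpha)^{m-1} & \vdots & \ddots & 1 \\
 & (-\alpha)^{m-1} & & (m-1)(-\alpha) \\
 & & \ddots & \vdots \\
 & & & (-\alpha)^{m-1}
\end{pmatrix}.$$

Multiplying an n-m+1 vector q by $P_{m-1}$ corresponds to multiplying
a polynomial of degree n-m, $q(\tau)$, by $(\tau-\alpha)^{m-1}$. The
columns of $P_{m-1}$ are linearly independent since
$(\tau-\alpha)^{m-1} q(\tau) \ne 0$ if $q \ne 0$.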

When presenting numerical results we will often use FORTRAN
E-format, e.g.

.123E-5 means $.123 \times 10^{-5}$.


CHAPTER  II 


COMPUTING  CONDITION  NUMBERS  FOR  ZEROS  OF  POLYNOMIALS 

1.  Definition  of  Condition  Numbers  for  Simple  Zeros 

In  this  chapter  we  explain  several  ways  to  compute  condition 
numbers  for  zeros  of  polynomials.  In  the  last  section  we  see  why  ill 
condition  is  always  associated  with  nearness  to  a polynomial  with  one 
or  more  double  zeros. 

Condition  numbers  are  intended  to  be  a numerical  measurement  of 
condition.  They  tell  us  how  large  a change  in  the  solution  may  result 
from a given change in the data. In general, for a problem which
converts m input data items $d_i$ into n components of a solution $s_j$,
there could be nm condition numbers
$\Gamma_{ij} \equiv \partial s_j / \partial d_i$, and
the condition of the problem could be defined to be a norm of the
matrix of $\Gamma_{ij}$. If there is a norm defined on the solution and
a norm defined on the data, then the most suitable norm for $\Gamma$,
the matrix of $\Gamma_{ij}$, is

$$\|\Gamma\| \equiv \sup_{\delta d \ne 0} \frac{\|\delta s\|}{\|\delta d\|},
\qquad \delta s_j = \sum_i \Gamma_{ij}\,\delta d_i.$$

One could just as well consider relative condition numbers,

$$\Gamma_{ij}^{\mathrm{rel}} \equiv \frac{1}{s_j}\,\frac{\partial s_j}{\partial d_i},$$

as long as $s_j \ne 0$.

For our purposes we will generally consider a separate condition
number for each zero of a polynomial but we will lump together changes
in the coefficients and measure the combined change by means of a norm.
Let p be a monic polynomial of degree n,

$$p(\tau) = \tau^n + \sum_{j=1}^{n} p_j \tau^{n-j},$$

and let $\delta p$ be a perturbing polynomial of degree n-1, not
necessarily monic, representing a change in the coefficients:

$$\delta p(\tau) = \sum_{j=1}^{n} \delta p_j \tau^{n-j}.$$

Let $\alpha$ be a zero of $p(\tau)$ and $\alpha + \delta\alpha$ a zero of
$p(\tau) + \delta p(\tau)$.


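Definition. The (absolute) condition number, $\gamma$, of $\alpha$ with
respect to changes $\delta p$ is

$$\gamma \equiv \lim_{\Delta \to 0} \;
\sup_{\delta p \text{ with } \|\delta p\| = \Delta}
\frac{|\delta\alpha|}{\|\delta p\|}. \tag{1.1}$$

As we have seen, this limit is infinite for multiple zeros $\alpha$, a
defect which we shall remedy shortly.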


There is one aspect of ill condition of zeros of polynomials
that may surprise those accustomed to thinking of ill condition
primarily in terms of systems of linear equations. In that context norms
are usually chosen in such a way that the condition number of a matrix
with respect to inversion is never less than 1. There is no such
natural choice of norms for zeros of polynomials and their condition
numbers may take on any positive value. We shall see in chapters IX
and X that well conditioned zeros can be very well conditioned indeed:
in a certain reasonable norm, the condition number of one of the zeros
of Wilkinson's polynomial is about 1.E-16.

Our definition of condition and condition number is similar to
that of Wilkinson [34], and is also a special case of a more general
formulation proposed by Rice [27]. Both Rice and Wilkinson also
propose relative condition numbers which we would define as

$$\gamma_{\mathrm{rel}} = \frac{\gamma}{|\alpha|}$$

for $\alpha \ne 0$. In this case we would choose a norm for $\delta p$
which would measure relative changes in the coefficients. An example is

$$\|\delta p\| = \Bigl(\sum_{j=1}^{n} \Bigl|\frac{\delta p_j}{p_j}\Bigr|^2\Bigr)^{1/2}$$

if all $p_j \ne 0$. Other norms can be devised suitable for the case when
some $p_j$ is zero. It is the responsibility of the definer of a problem
to decide the appropriate norm. For instance, if none of the
zeros of p are 0, then the polynomial $\hat{p}(\tau)$, whose positive zeros
are the moduli of the zeros of p, may be used to define a norm:

$$\|\delta p\| = \Bigl(\sum_{j=1}^{n} \Bigl|\frac{\delta p_j}{\hat{p}_j}\Bigr|^2\Bigr)^{1/2}.$$

None of the $\hat{p}_j$ are 0 as long as $p_n \ne 0$.

2. Definition of Condition Numbers for Multiple Zeros

The  previous  discussion  shows  that  our  definition  of  condition 
number  does  not  make  sense  for  a multiple  zero,  which  would  apparently 
have  an  infinite  condition  number.  That  infinite  condition  is  caused 
by  the  fact  that  most  arbitrary  infinitesimal  perturbations  applied  to 
a polynomial  with  a multiple  zero  tend  to  break  up  that  multiple  zero 
into  ill  conditioned  simple  zeros. 

In order to have a sensible definition of condition number
for a multiple zero we must only allow perturbations which do not
destroy the multiple zero. Here is an example: consider a real monic
cubic polynomial,

$$p(\tau) = (\tau-\alpha)^2(\tau-\beta)
= \tau^3 - (2\alpha+\beta)\tau^2 + (2\alpha\beta+\alpha^2)\tau - \alpha^2\beta,$$

and small quadratic perturbations,

$$q(\tau) = q_1\tau^2 + q_2\tau + q_3,$$

which preserve the multiplicity of $\alpha$ so that

$$p(\tau)+q(\tau) = (\tau - (\alpha+\epsilon))^2(\tau - (\beta+\theta)).$$

We discover that

$$q_1 = -(2\epsilon + \theta),$$

$$q_2 = 2\alpha\epsilon + 2\beta\epsilon + 2\alpha\theta + (2\epsilon\theta + \epsilon^2),$$

$$q_3 = -\bigl(2\alpha\beta\epsilon + \alpha^2\theta + (2\alpha\epsilon\theta + \beta\epsilon^2 + \epsilon^2\theta)\bigr),$$

where the inner parentheses segregate higher order terms which we shall
ignore. Thus the three parameters $q_j$ are defined in terms of the
two variables $\epsilon$ and $\theta$. We can choose any two of the $q_j$ as the
independent parameters of the perturbation and solve for $\epsilon$ in terms
of them. Thus if we choose $q_1$ and $q_2$, we find

$$\epsilon = \frac{q_2 + 2\alpha q_1}{2(\beta-\alpha)}$$

and

$$\theta = -\frac{(\alpha+\beta)q_1 + q_2}{\beta-\alpha}$$

to first order in $\epsilon$ and $\theta$.

Then we can see that the ratio of change in solution ($\epsilon$) to change
in data ($q_1$) is

$$\frac{\epsilon}{q_1} = \frac{q_2/q_1 + 2\alpha}{2(\beta-\alpha)},$$

which will be well defined unless $\beta = \alpha$, which would mean that the
multiplicity of $\alpha$ was not two, as we thought, but actually three.
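
The first-order relations for the cubic can be checked numerically. The sketch below is an illustration under assumed sample values (alpha = 1, beta = 3, and small shifts of order 1e-6); it uses the relation eps = (q2 + 2*alpha*q1) / (2*(beta - alpha)), which follows from expanding (tau - (alpha + eps))^2 (tau - (beta + theta)) and discarding higher order terms, together with the companion relation for theta.

```python
def cubic_coeffs(alpha, beta):
    """Coefficients of tau^2, tau, and 1 in (tau - alpha)^2 (tau - beta)."""
    return (-(2 * alpha + beta), 2 * alpha * beta + alpha ** 2, -alpha ** 2 * beta)

alpha, beta = 1.0, 3.0            # assumed sample zeros
eps, theta = 1e-6, 3e-6           # assumed small shifts of alpha and beta

# Perturbation q = (q1, q2, q3) that moves the zeros while keeping alpha double.
p_old = cubic_coeffs(alpha, beta)
p_new = cubic_coeffs(alpha + eps, beta + theta)
q1, q2, q3 = (new - old for old, new in zip(p_old, p_new))

# Recover the shifts from q1 and q2 to first order.
eps_est = (q2 + 2 * alpha * q1) / (2 * (beta - alpha))
theta_est = -((alpha + beta) * q1 + q2) / (beta - alpha)

print("eps:  ", eps, " estimate:", eps_est)
print("theta:", theta, " estimate:", theta_est)
```

The estimates agree with the true shifts up to terms of second order in the shifts, and both formulas break down only as beta approaches alpha.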

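In general let

$$p(\tau) = (\tau-\alpha)^m q(\tau), \qquad q(\alpha) \ne 0.$$

Definition. The condition number of $\alpha$ is

$$\gamma \equiv \lim_{\Delta \to 0} \;
\sup_{\substack{\delta p \text{ maintaining multiplicity of } \alpha \\ \text{with } \|\delta p\| = \Delta}}
\frac{|\delta\alpha|}{\|\delta p\|}. \tag{2.1}$$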


In order to appreciate graphically what is meant by constraining
perturbations to maintain multiplicity, consider the drawings in
Figures II.1-II.3 of the space of monic real cubic polynomials. That
space is three dimensional and the set of small perturbations about a
point in that space is a closed ball. The drawings are based on a
norm in which closed balls look like spheres; see Figure II.1.

The set of monic real cubic polynomials with double zeros is a
two dimensional algebraic surface (manifold). The set of small
perturbations maintaining multiplicity of a double zero is the
intersection of the ball and that manifold. If the manifold were a plane that
set might be an oval. In general that set resembles a bent coin or
an ellipse warped into three dimensions; see Figure II.2.

The double zero is well behaved in the face of perturbations that
keep the polynomial on the manifold but away from the one dimensional
submanifold of real cubic polynomials with a triple zero. That
submanifold is an algebraic curve and a subset of the surface mentioned
previously. The set of small perturbations maintaining a triple zero
is the intersection of the ball and that curve, amounting to a
segment of the curve, as in Figure II.3.


Figure II.1. A small ball about p in $R^3$ containing perturbations
$\delta p$ such that $\|\delta p\| \le \Delta$.

Figure II.2. The set of small perturbations about p maintaining a
double zero resembles a bent coin. (Figure label: surface of
polynomials with double zeros.)

Figure II.3. The set of small perturbations about p maintaining a
triple zero is a segment of a curve.

3. Condition Numbers for n-tuple Zeros

As a start we derive a condition number for the simplest case,

4. Resolution of Condition Number into Components

We show now that the condition number we have defined is a product
of two independent factors. Thus for the polynomial

$$p(\tau) = (\tau-\alpha)^m \prod_{j=m+1}^{n} (\tau-\zeta_j)$$

the condition number for $\alpha$ will be shown to be

$$\gamma = \frac{\sigma/m}{\prod_{j=m+1}^{n} |\alpha-\zeta_j|},$$

where the numerator $\sigma/m$ will depend on the zero $\alpha$ but not on the
other zeros $\zeta_j$. The denominator depends on the other zeros $\zeta_j$ but
not on m nor on the norm. We require that $\alpha \ne \zeta_j$ so that m is
indeed the true multiplicity of $\alpha$.

W. Kahan demonstrated this fact in [17] after showing that, for a
monic polynomial of degree n, an m-tuple zero may be regarded as an
analytic function of the first n + 1 - m coefficients of that
polynomial. This may be compared to the well known result that a simple
zero is an analytic function of the n coefficients of a monic
polynomial. In both cases analyticity is confined to regions in which the
zero does not increase or decrease in multiplicity.

We shall infer the resolution of the condition number directly,
however. Let

$$p(\tau) = (\tau-\alpha)^m q(\tau)$$

and let $\delta p$ represent infinitesimal variations in p such that
$p + \delta p$ has a multiple zero $\alpha + \delta\alpha$ of multiplicity m. Then

$$(p+\delta p)(\tau) = (\tau - (\alpha+\delta\alpha))^m \, (q+\delta q)(\tau),$$

and in consequence, keeping only first order terms, we find

$$\delta p(\tau) = (\tau-\alpha)^{m-1}\{(\tau-\alpha)\,\delta q(\tau) - m\,q(\tau)\,\delta\alpha\}.$$

Thus $\delta p$ is displayed as a function of $\delta q$ and $\delta\alpha$.

We claim

$$\gamma = \sup_{\text{constrained } \delta p} \frac{|\delta\alpha|}{\|\delta p\|}
= \frac{1}{m\,|q(\alpha)|} \sup_{r \text{ of degree } \le n-m}
\frac{|r(\alpha)|}{\|(\tau-\alpha)^{m-1} r(\tau)\|},$$

and we prove it by showing the one-to-one correspondence between such
r and such $\delta p$. Namely let

$$r(\tau) = (\tau-\alpha)\,\delta q(\tau) - m\,q(\tau)\,\delta\alpha$$

so

$$\delta p(\tau) = (\tau-\alpha)^{m-1} r(\tau).$$

Since $\delta p$ has degree $\le n-1$, r has degree $\le n-m$. The dimension of
the vector r is n-m+1, however, since the polynomial $r(\tau)$ is not
monic.

Any such r defines $\delta p$ and hence $\delta q$ uniquely. Setting $\tau = \alpha$
in the definition of r gives $r(\alpha) = -m\,q(\alpha)\,\delta\alpha$, and then

$$\delta q(\tau) = \frac{r(\tau) + m\,q(\tau)\,\delta\alpha}{\tau-\alpha}.$$

The numerator of the expression for $\delta q(\tau)$ does vanish when $\tau = \alpha$
so that expression is indeed a polynomial rather than a rational
function. Therefore we may write

$$\frac{|\delta\alpha|}{\|\delta p\|} = \frac{1}{m\,|q(\alpha)|}\,
\frac{|r(\alpha)|}{\|(\tau-\alpha)^{m-1} r(\tau)\|}$$

and, since $q(\tau) = \prod_{j=m+1}^{n} (\tau-\zeta_j)$, then
$|q(\alpha)| = \prod_{j=m+1}^{n} |\alpha-\zeta_j|$.
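
For a simple zero (m = 1) the relation above can be checked against an actual perturbation. The sketch below is an illustration under assumptions not made in the text: it uses the unweighted Euclidean norm, for which the Cauchy-Schwarz inequality gives the supremum of |r(alpha)| / ||r|| as the Euclidean length of (alpha^(n-1), ..., alpha, 1), and a sample cubic with zeros 1, 2, 3. The helper names are hypothetical.

```python
import math

def poly_from_zeros(zeros):
    """Monic coefficients, highest power first, of the product of (tau - z)."""
    coeffs = [1.0]
    for z in zeros:
        coeffs = [a - z * b for a, b in zip(coeffs + [0.0], [0.0] + coeffs)]
    return coeffs

def horner(coeffs, x):
    """Evaluate a polynomial given by coefficients from highest power down."""
    value = 0.0
    for c in coeffs:
        value = value * x + c
    return value

def newton(coeffs, x, steps=30):
    """Refine an approximate zero of the polynomial by Newton's method."""
    n = len(coeffs) - 1
    dcoeffs = [c * (n - i) for i, c in enumerate(coeffs[:-1])]
    for _ in range(steps):
        x -= horner(coeffs, x) / horner(dcoeffs, x)
    return x

zeros = [1.0, 2.0, 3.0]           # assumed sample polynomial
alpha = 3.0                       # the zero whose condition is examined
n = len(zeros)

# Predicted condition number for m = 1 in the Euclidean norm.
e_alpha = [alpha ** (n - 1 - j) for j in range(n)]
sigma = math.sqrt(sum(c * c for c in e_alpha))
gamma = sigma / math.prod(abs(alpha - z) for z in zeros if z != alpha)

# Worst-case perturbation of norm delta: proportional to e_alpha.
delta = 1e-8
dp = [delta * c / sigma for c in e_alpha]
p = poly_from_zeros(zeros)
perturbed = [p[0]] + [pj + dpj for pj, dpj in zip(p[1:], dp)]
ratio = abs(newton(perturbed, alpha) - alpha) / delta

print("predicted condition number:", gamma)
print("observed |change in zero| / |change in p|:", ratio)
```

The observed ratio approaches the predicted value as delta shrinks; other perturbations of the same norm move the zero by less.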

As claimed, then, we may write the condition number for $\alpha$ as

$$\gamma = \frac{\sigma/m}{\prod_{j=m+1}^{n} |\alpha-\zeta_j|} \tag{4.1}$$

and

$$\frac{\sigma}{m} = \frac{1}{m} \sup_{\text{degree } r \le n-m}
\frac{|r(\alpha)|}{\|(\tau-\alpha)^{m-1} r(\tau)\|} \tag{4.2}$$

is the part of the condition number that is independent of the other
zeros $\zeta_j$. The next few sections will be devoted to explaining how to
compute $\sigma$.

5. Computing $\sigma$ for Arbitrary Norms: Dual Method


W. Kahan [19] has provided the following method for computing $\sigma$
in arbitrary norms. We shall see that it leads to solving a standard
kind of linear approximation problem, namely

$$\sigma = \min_{\ell^*} \|s^* + \ell^* A\| / (m-1)!,$$

for vectors $s^*$ and $\ell^*$ and a matrix A to be defined.

To prove the statement above, write the formula for $\sigma$ as

$$\sigma = \sup_{r \text{ of degree } \le n-m} \frac{|e_\alpha^* Z r|}{\|P_{m-1} r\|}
= \sup_{y \text{ of degree } \le n-1} \frac{|e_\alpha^* Z S y|}{\|P_{m-1} S y\|},$$

where $y \in C^n$ and S is a map from $C^n$ onto $C^{n-m+1}$. Z is the
operator which fills out (n-m+1)-vectors with zeros to form n-vectors:

$$Z = \begin{pmatrix} 0 \\ I_{n-m+1} \end{pmatrix},$$

with m-1 rows of zeros above an identity of order n-m+1.
ZS is required to be a projector. Finally $P_{m-1}$ is the linear
operator from $C^{n-m+1}$ to $C^n$ mentioned in chapter I which represents
multiplication by $(\tau-\alpha)^{m-1}$.

Our goal is to transform the sup problem into a dual min problem.
We therefore state a duality theorem of Buck [3]. The setting for the
theorem is a normed vector space E with its dual space of functionals
$E^*$. If M is a subspace in E and $M^\perp$ its annihilator in $E^*$,
the theorem states

$$\sup_{x \in M,\ x \ne 0} \frac{|v_0^* x|}{\|x\|} = \min_{v^* \in M^\perp} \|v_0^* - v^*\|.$$

For the application at hand, E is $C^n$, $M = \{P_{m-1}Sy \mid y \in C^n\}$. Then
$M^\perp = \{v^* \mid v^* P_{m-1} S = 0\}$. We discover

$$\sup_{y \in C^n} \frac{|v_0^* P_{m-1} S y|}{\|P_{m-1} S y\|}
= \min_{u^* P_{m-1} S = v_0^* P_{m-1} S} \|u^*\|.$$

Then if there is a $v_0^*$ such that $v_0^* P_{m-1} S = e_\alpha^* Z S$ we will
have the sup we seek, expressed as a min.

Since the columns of $P_{m-1}$ are linearly independent, the range
space of $P_{m-1}^*$ must have full dimension so the equation
$Z^* e_\alpha = P_{m-1}^* v_0$ may be solved for $v_0$. Therefore

$$\sigma = \min_{u^* P_{m-1} S = e_\alpha^* Z S} \|u^*\|. \tag{5.1}$$

Let us see what the solutions of $u^* P_{m-1} S = e_\alpha^* Z S$ are; among them we will find that of minimal norm. As in chapter I let $D^k$ denote the operator which maps polynomials to their k'th derivatives. Then we find that

$$ u^* = e_\alpha^* D^{m-1}/(m-1)! $$

is one solution of the equation. For consider any $y(\tau)$ and let $r(\tau)$ be its image; $r = Sy$. Then

$$ e_\alpha^* D^{m-1} P_{m-1} r = \{(\tau-\alpha)^{m-1} r(\tau)\}^{(m-1)}(\alpha) = (m-1)!\, r(\alpha) = (m-1)!\, e_\alpha^* Z r . $$

The next step is to determine the solutions of the homogeneous equation $u^* P_{m-1} S = 0$. The rank of $S$ is $n-m+1$, as is the rank of $P_{m-1}$, and therefore their product. Since $u^*$ has dimension $n$, the null space of $(P_{m-1} S)^*$ must have dimension $m-1$. Therefore we seek a subspace of solutions $u^*$ of dimension $m-1$.

We may easily verify that $\{e_\alpha^*, e_\alpha^* D, \ldots, e_\alpha^* D^{m-2}\}$ is a set of solutions to $u^* P_{m-1} S = 0$, because $e_\alpha^* D^k P_{m-1} r = \{(\tau-\alpha)^{m-1} r(\tau)\}^{(k)}(\alpha) = 0$ for $0 \le k \le m-2$. These $m-1$ linearly independent solutions therefore form a basis for the solution space and we may insert the general solution of the inhomogeneous equation in the formula (5.1) to get

$$ \sigma = \frac{1}{(m-1)!} \min_{\lambda_k} \Big\| e_\alpha^* D^{m-1} + \sum_{k=0}^{m-2} \lambda_k\, e_\alpha^* D^k \Big\| . \tag{5.2} $$

If we write the $(m-1)$-vector $\ell^* = (\lambda_0, \lambda_1, \ldots, \lambda_{m-2})$, the $(m-1)$ by $n$ matrix

$$ A = \begin{pmatrix} e_\alpha^* \\ e_\alpha^* D \\ \vdots \\ e_\alpha^* D^{m-2} \end{pmatrix} $$

and the vector $s^* = e_\alpha^* D^{m-1}$, we have

$$ \sigma = \min_{\ell^*} \|s^* + \ell^* A\| / (m-1)! . \tag{5.3} $$

Consequently $\sigma$ may be found by solving the indicated linear approximation problem, as claimed.


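6. Computing σ for ℓ_2 Norms -- Dual Method

We now evaluate $\sigma$ for $\ell_2$ norms. First we note that $\|u^*\|_W = \|u^* W^{-1/2}\|_2$ and, using the theory of least squares, the minimal residual may be expressed as

$$ \sigma = \min_{\ell} \|W^{-1/2} s + W^{-1/2} A^* \ell\|_2 / (m-1)! $$

$$ = \left( s^* \{ W^{-1} - W^{-1} A^* (A W^{-1} A^*)^{-1} A W^{-1} \} s \right)^{1/2} / (m-1)! . $$

In particular, if $m = 1$ then $A = 0$ and $s = e_\alpha$ so

$$ \sigma = (e_\alpha^* W^{-1} e_\alpha)^{1/2} = \Big\{ \sum_{j=1}^{n} |\alpha^2|^{n-j} / w_j \Big\}^{1/2} . \tag{6.1} $$

If $m = 2$, then $A = e_\alpha^*$, $s = D^* e_\alpha$, and after some computation we find

$$ \sigma^2 = \frac{1}{|\alpha^2|} \left\{ \sum_{j=1}^{n} (n-j)^2 |\alpha^2|^{n-j} / w_j \;-\; \frac{\left| \sum_{j=1}^{n} (n-j) |\alpha^2|^{n-j} / w_j \right|^2}{\sum_{j=1}^{n} |\alpha^2|^{n-j} / w_j} \right\} , $$

or in a computationally more economical form,

$$ \sigma^2 = \frac{1}{|\alpha^2|} \; \frac{\sum_{j=1}^{n-1} \frac{1}{w_j} |\alpha^2|^{n-j} \sum_{k=j+1}^{n} \frac{1}{w_k} |\alpha^2|^{n-k} (k-j)^2}{\sum_{j=1}^{n} |\alpha^2|^{n-j} / w_j} . $$

For $m > 2$,

$$ \sigma = \frac{1}{(m-1)!} \min_{\lambda_k} \left\{ \sum_{j=1}^{n} \frac{1}{w_j} \left| \frac{(n-j)!}{(n-j-m+1)!}\, \alpha^{n-j-m+1} + \sum_{k=0}^{m-2} \lambda_k \frac{(n-j)!}{(n-j-k)!}\, \alpha^{n-j-k} \right|^2 \right\}^{1/2} . $$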


This may be written in conventional least squares format as

7. Computing σ for ℓ_2 Norms -- Primal Method

In the previous section we computed $\sigma$ by solving the dual problem. Our goal now is to find $\sigma$ directly. First convert the expression

$$ \sigma = \sup_{r \text{ of degree} \le n-m} \frac{|r(\alpha)|}{\|(\tau-\alpha)^{m-1} r(\tau)\|} $$

into the vector notation:

$$ \sigma = \sup_{r} \frac{|e_\alpha^* r|}{\|P_{m-1} r\|} . $$

But if we define a new norm $\|r\|_P = \|P_{m-1} r\|$ then by definition

$$ \sigma = \|e_\alpha^*\|_P $$

in the dual norm.

Now $\|r\|_P = \|(P_{m-1}^* W P_{m-1})^{1/2} r\|_2$ in our $\ell_2$ norm so

$$ \sigma^2 = e_\alpha^* (P_{m-1}^* W P_{m-1})^{-1} e_\alpha . \tag{7.1} $$

We can check this result by comparison with the simplest case, $m = 1$. Then $P_0 = I$ and

$$ \sigma^2 = e_\alpha^* W^{-1} e_\alpha = \sum_{j} |\alpha^2|^{n-j} / w_j , $$


which  is  just  the  result  obtained  in  the  previous  section. 
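
The agreement between the dual and primal forms can also be checked numerically for m = 2. The following is a minimal sketch, not part of the original text: it assumes a diagonal weight matrix W = diag(w_1, ..., w_n), coefficient vectors ordered from the leading coefficient down, and an evaluation vector with one entry per coefficient of r (an assumption about the intended dimension). The helper names `solve`, `mult_coeffs`, `sigma_sq_primal` and `sigma_sq_dual_m2` are illustrative.

```python
def solve(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(M, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            factor = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= factor * aug[c][k]
    z = [0j] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z


def mult_coeffs(alpha, d):
    """Coefficients t_0..t_d (leading first) of (tau - alpha)**d."""
    t = [1 + 0j]
    for _ in range(d):
        t = [a - alpha * b for a, b in zip(t + [0j], [0j] + t)]
    return t


def sigma_sq_primal(alpha, w, m):
    """sigma^2 = e* (P* W P)^{-1} e, with P = multiplication by (tau-alpha)^(m-1)."""
    n, d = len(w), m - 1
    t = mult_coeffs(alpha, d)
    cols = n - d

    def P(k, j):  # banded Toeplitz entry P[k][j] = t[k-j]
        return t[k - j] if 0 <= k - j <= d else 0j

    M = [[sum(w[k] * P(k, i).conjugate() * P(k, j) for k in range(n))
          for j in range(cols)] for i in range(cols)]
    e_row = [alpha ** (cols - 1 - i) for i in range(cols)]  # e* r = r(alpha)
    z = solve(M, [x.conjugate() for x in e_row])
    return sum(a * b for a, b in zip(e_row, z)).real


def sigma_sq_dual_m2(alpha, w):
    """Closed form of the dual result for m = 2 (alpha must be nonzero)."""
    n = len(w)
    s = abs(alpha) ** 2
    s0 = sum(s ** (n - j) / w[j - 1] for j in range(1, n + 1))
    s1 = sum((n - j) * s ** (n - j) / w[j - 1] for j in range(1, n + 1))
    s2 = sum((n - j) ** 2 * s ** (n - j) / w[j - 1] for j in range(1, n + 1))
    return (s2 - s1 ** 2 / s0) / s


if __name__ == "__main__":
    alpha, w = 0.7 + 0.4j, [1.0, 2.0, 0.5, 3.0, 1.5]
    print(sigma_sq_primal(alpha, w, 2), sigma_sq_dual_m2(alpha, w))
```

The two printed values should agree to rounding error; the primal route needs a linear solve of order n-m+1, while the m = 2 dual form needs only three weighted sums.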


8.  Computational  Details 

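We shall see how to compute the non-zero elements of $P_{m-1}^* W P_{m-1}$.

Let $P$ denote a generalized matrix of the $P_{m-1}$ type corresponding to multiplication by a monic polynomial $t(\tau)$ of degree $d$. For instance, if $m = 3$, $P_2$ corresponds to $(\tau-\alpha)^2 = \tau^2 - 2\alpha\tau + \alpha^2$. Then $t_0 = 1$, $t_1 = -2\alpha$, and $t_2 = \alpha^2$ are the elements of $t$. $P$ has the form of an $n$ by $n-d$ matrix

$$ P = \begin{pmatrix} t_0 & & \\ t_1 & t_0 & \\ \vdots & t_1 & \ddots \\ t_d & \vdots & \ddots & t_0 \\ & t_d & & t_1 \\ & & \ddots & \vdots \\ & & & t_d \end{pmatrix} $$

so

$$ P_{ij} = \begin{cases} t_{i-j} & \text{if } j \le i \le j+d , \\ 0 & \text{otherwise.} \end{cases} $$

Then

$$ (P^* W P)_{ij} = \begin{cases} \displaystyle\sum_{k=\max(i,j)}^{\min(i,j)+d} w_k\, t_{k-i}^*\, t_{k-j} & \text{if } |i-j| \le d , \\ 0 & \text{otherwise,} \end{cases} $$

so this matrix has bandwidth $2d+1$ in addition to being positive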
definite  Hermitian. 
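
The band formula can be compared against a dense product on a small case. This is an illustrative sketch under stated assumptions: W is diagonal with entries `w`, indices are 0-based rather than 1-based as in the text, and the function names `mult_coeffs`, `dense_pwp` and `band_entry` are hypothetical.

```python
def mult_coeffs(alpha, d):
    """Coefficients t_0..t_d (leading first) of (tau - alpha)**d."""
    t = [1 + 0j]
    for _ in range(d):
        t = [a - alpha * b for a, b in zip(t + [0j], [0j] + t)]
    return t


def dense_pwp(t, w):
    """Form P explicitly (n by n-d, P[k][j] = t[k-j]) and return P* W P."""
    n, d = len(w), len(t) - 1
    cols = n - d
    P = [[t[k - j] if 0 <= k - j <= d else 0j for j in range(cols)]
         for k in range(n)]
    return [[sum(P[k][i].conjugate() * w[k] * P[k][j] for k in range(n))
             for j in range(cols)] for i in range(cols)]


def band_entry(t, w, i, j):
    """Entry (i, j) of P* W P from the band formula; zero outside the band."""
    d = len(t) - 1
    if abs(i - j) > d:
        return 0j
    return sum(w[k] * t[k - i].conjugate() * t[k - j]
               for k in range(max(i, j), min(i, j) + d + 1))


if __name__ == "__main__":
    t = mult_coeffs(0.5 - 0.25j, 2)           # (tau - alpha)^2, i.e. m = 3
    w = [1.0, 0.5, 2.0, 1.5, 3.0, 0.75]
    full = dense_pwp(t, w)
    worst = max(abs(full[i][j] - band_entry(t, w, i, j))
                for i in range(len(full)) for j in range(len(full)))
    print("largest discrepancy:", worst)
```

Only the 2d+1 diagonals inside the band need to be stored, so each entry costs at most d+1 multiplications instead of n.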
9. Condition Numbers for Complex Conjugate Zeros of Real Polynomials
The formulas derived in the previous sections were valid for complex zeros of a complex polynomial subject to complex perturbations. It is easy to verify that the same formulas apply for real zeros of a real polynomial subject to real perturbations. The case of complex zeros of a real polynomial subject to real perturbations, however, is more complicated. The requirement that the perturbed polynomial remain real amounts to an extra constraint. We now define condition numbers that reflect this constraint. Let

$$ p(\tau) = (\tau-\alpha)^m (\tau-\bar\alpha)^m q(\tau) , \qquad q(\alpha) \ne 0 , $$

represent a real polynomial with a complex m-tuple zero at $\alpha$ and consequently at $\bar\alpha$ as well, with $\mathrm{Im}\,\alpha \ne 0$. Considering infinitesimal perturbations we define

$$ (p+\delta p)(\tau) = (\tau - (\alpha+\delta\alpha))^m (\tau - (\bar\alpha+\delta\bar\alpha))^m (q+\delta q)(\tau) $$

and to first order we find

$$ \delta p(\tau) = (\tau-\alpha)^{m-1} (\tau-\bar\alpha)^{m-1} \left[ (\tau-\alpha)(\tau-\bar\alpha)\,\delta q(\tau) - 2m\, q(\tau) \{ (\mathrm{Re}\,\delta\alpha)\tau - \mathrm{Re}(\bar\alpha\,\delta\alpha) \} \right] . $$

Definition. The condition number of $\alpha$ with respect to real perturbations of $p$ is

$$ \gamma_c = \lim_{\Delta \to 0} \; \sup_{\text{constrained } \delta p \text{ with } \|\delta p\| = \Delta} \left\{ \frac{|\delta\alpha|}{\Delta} \right\} . \tag{9.1} $$

Let

$$ r(\tau) = (\tau-\alpha)(\tau-\bar\alpha)\,\delta q(\tau) - 2m\, q(\tau) \{ (\mathrm{Re}\,\delta\alpha)\tau - \mathrm{Re}(\bar\alpha\,\delta\alpha) \} . $$

Then real $\delta q$ and complex $\delta\alpha$ define $r$ uniquely. Conversely,

$$ \delta\alpha = \frac{\sqrt{-1}\; r(\alpha)}{2m\,(\mathrm{Im}\,\alpha)\, q(\alpha)} $$

and

$$ \delta q(\tau) = \frac{r(\tau) + 2m\, q(\tau) \{ (\mathrm{Re}\,\delta\alpha)\tau - \mathrm{Re}(\bar\alpha\,\delta\alpha) \}}{(\tau-\alpha)(\tau-\bar\alpha)} . $$

As before we can verify that the expression for $\delta q$ defines a polynomial rather than a rational function.

Thus there is a one-to-one correspondence between $r$ and $(\delta\alpha, \delta q)$. Substituting in (9.1) we find

$$ \gamma_c = \frac{1}{2m\, |\mathrm{Im}\,\alpha|\, |q(\alpha)|} \; \sup_{r \text{ of degree} \le n-2m+1} \frac{|r(\alpha)|}{\|(\tau-\alpha)^{m-1} (\tau-\bar\alpha)^{m-1} r(\tau)\|} $$

or

$$ \gamma_c = \frac{\sigma_c}{2m\, |\mathrm{Im}\,\alpha|\, |q(\alpha)|} . \tag{9.2} $$

Thus in this case as well, the condition number consists of (1) a numerator $\sigma_c / (2m\,|\mathrm{Im}\,\alpha|)$ independent of the other zeros and (2) a denominator $|q(\alpha)| = \prod_{j=2m+1}^{n} |\alpha-\zeta_j|$.

The limit $\mathrm{Im}\,\alpha \to 0$ corresponds to $\alpha$ and $\bar\alpha$ coalescing to form a zero of greater multiplicity $2m$. Therefore the condition number becomes infinite as $\mathrm{Im}\,\alpha \to 0$.

10. Computing σ_c for ℓ_2 Norms

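We turn now to the problem of computing $\sigma_c$ by a method similar to the primal method for computing $\sigma$. Define $C_{m-1}$ mapping $R^{n-2m+2}$ into $R^n$ as the operator corresponding to multiplication by $(\tau-\alpha)^{m-1} (\tau-\bar\alpha)^{m-1}$ for complex $\alpha$. Then in matrix form, for instance $C_1$ is $n$ by $n-2$:

$$ C_1 = \begin{pmatrix} 1 & & \\ -2\,\mathrm{Re}\,\alpha & 1 & \\ |\alpha|^2 & -2\,\mathrm{Re}\,\alpha & \ddots \\ & |\alpha|^2 & \ddots & 1 \\ & & \ddots & -2\,\mathrm{Re}\,\alpha \\ & & & |\alpha|^2 \end{pmatrix} . $$

Consequently

$$ \sigma_c^2 = \sup_{r} \frac{|r(\alpha)|^2}{\|C_{m-1} r\|^2} = \sup_{r} \frac{r^* e_\alpha e_\alpha^* r}{r^* C_{m-1}^* W C_{m-1} r} . $$

As before $C_{m-1}^* W C_{m-1}$ is real symmetric positive definite so $(C_{m-1}^* W C_{m-1})^{-1/2}$ exists. We find that

$$ \sigma_c^2 = \sup_{r} \frac{r^* (C_{m-1}^* W C_{m-1})^{-1/2}\, e_\alpha e_\alpha^*\, (C_{m-1}^* W C_{m-1})^{-1/2} r}{r^* r} . $$

The supremum is over real $r$ but the matrix $e_\alpha e_\alpha^*$ is complex so a Rayleigh quotient argument does not apply directly. Instead write $e_\alpha^* = u^* + i v^*$ where

$$ u^* = \mathrm{Re}(e_\alpha^*) = (\mathrm{Re}(\alpha^{n-1}) \;\cdots\; \mathrm{Re}\,\alpha \;\; 1) $$

and

$$ v^* = \mathrm{Im}(e_\alpha^*) = (\mathrm{Im}(\alpha^{n-1}) \;\cdots\; \mathrm{Im}\,\alpha \;\; 0) . $$

Then observe that for any real $s$,

$$ s^* e_\alpha e_\alpha^* s = s^* (u u^* + v v^*) s . $$

Applying the Rayleigh quotient theorem now we find

$$ \sigma_c^2 = \text{max eigenvalue}\left[ (C_{m-1}^* W C_{m-1})^{-1/2} (u u^* + v v^*) (C_{m-1}^* W C_{m-1})^{-1/2} \right] = \text{max eigenvalue}\,[\, x x^* + y y^* \,] $$

where $x = (C_{m-1}^* W C_{m-1})^{-1/2} u$ and $y = (C_{m-1}^* W C_{m-1})^{-1/2} v$.

A rank two matrix has two positive eigenvalues which can be found by reduction to a matrix of dimension two. For an eigenvalue $\lambda$ and an eigenvector $(\theta x + \phi y)$,

$$ \lambda (\theta x + \phi y) = (x x^* + y y^*)(\theta x + \phi y) . $$

Therefore

$$ \begin{pmatrix} x^* x & x^* y \\ y^* x & y^* y \end{pmatrix} \begin{pmatrix} \theta \\ \phi \end{pmatrix} = \lambda \begin{pmatrix} \theta \\ \phi \end{pmatrix} $$

and $\lambda$ is an eigenvalue of the indicated two by two matrix. The largest eigenvalue of that matrix is

$$ \lambda_{\max} = \tfrac{1}{2} \left( x^* x + y^* y + \left( (x^* x - y^* y)^2 + 4 |x^* y|^2 \right)^{1/2} \right) \tag{10.1} $$

where

$$ x^* x = u^* (C_{m-1}^* W C_{m-1})^{-1} u , \qquad x^* y = u^* (C_{m-1}^* W C_{m-1})^{-1} v , $$

etc. Then

$$ \sigma_c^2 = \lambda_{\max} \tag{10.2} $$

and

$$ \gamma_c = \frac{\sqrt{\lambda_{\max}}}{2m\, |\mathrm{Im}\,\alpha|\, |q(\alpha)|} . \tag{10.3} $$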


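What does this result mean in the case $m = 1$? For comparison, suppose we computed the condition number $\gamma$ of the same complex $\alpha$ using the general formula for complex polynomials (4.1, 6.1). The result is

$$ \gamma = \sqrt{e_\alpha^* W^{-1} e_\alpha} \, / \, (2m\, |\mathrm{Im}\,\alpha|\, |q(\alpha)|) . $$

To compute $\sigma_c$ note that $x^* x = u^* W^{-1} u$, etc., and

$$ \lambda_{\max} = \tfrac{1}{2} \left( e_\alpha^* W^{-1} e_\alpha + \Delta^{1/2} \right) $$

where $\Delta = (x^* x - y^* y)^2 + 4 |x^* y|^2$. From the Cauchy-Schwarz inequality we can deduce that

$$ (e_\alpha^* W^{-1} e_\alpha)^2 = (x^* x + y^* y)^2 \ge \Delta \ge 0 , $$

and consequently

$$ \tfrac{1}{2} (e_\alpha^* W^{-1} e_\alpha) \le \lambda_{\max} \le e_\alpha^* W^{-1} e_\alpha . $$

Then we find that

$$ 1 \le \gamma / \gamma_c \le \sqrt{2} \tag{10.4} $$

for $m = 1$.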
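
The bound for m = 1 is easy to confirm numerically, since no matrix inverse is needed when $C_0 = I$. The sketch below is illustrative only: it assumes a diagonal W with entries `w`, and the function name `gamma_ratio_m1` is hypothetical.

```python
import math


def gamma_ratio_m1(alpha, w):
    """Return (gamma/gamma_c, lambda_max, (x*x, x*y, y*y)) for m = 1.

    With m = 1 the operator C_0 is the identity, so x*x = u* W^{-1} u etc.,
    where u and v are the real and imaginary parts of (alpha^(n-1), ..., 1).
    """
    n = len(w)
    powers = [alpha ** (n - j) for j in range(1, n + 1)]
    xx = sum(z.real ** 2 / wj for z, wj in zip(powers, w))
    yy = sum(z.imag ** 2 / wj for z, wj in zip(powers, w))
    xy = sum(z.real * z.imag / wj for z, wj in zip(powers, w))
    lam_max = 0.5 * (xx + yy + math.sqrt((xx - yy) ** 2 + 4.0 * xy ** 2))
    # gamma^2 / gamma_c^2 = (e* W^{-1} e) / lambda_max = (x*x + y*y) / lambda_max
    return math.sqrt((xx + yy) / lam_max), lam_max, (xx, xy, yy)


if __name__ == "__main__":
    ratio, lam, _ = gamma_ratio_m1(0.3 + 0.9j, [1.0, 2.0, 0.5, 4.0])
    print("gamma/gamma_c =", ratio, " lambda_max =", lam)
```

The ratio reaches its upper limit only when x and y are orthogonal and of equal length, and its lower limit when they are parallel.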

When $m > 1$, however, the discrepancy between these condition numbers can be much greater. In fact, as $\mathrm{Im}\,\alpha \to 0$ for fixed $\mathrm{Re}\,\alpha$ and $m \ge 2$, $\gamma/\gamma_c$ increases without bound. The condition numbers differ because $\gamma$ maintains the multiplicity of only one zero intact but $\gamma_c$ maintains intact the multiplicities of two zeros.

Computational Details for σ_c

The computation of $\sigma_c$ is similar to that of $\sigma$, except the matrix $C_{m-1}$ corresponds to multiplication by $t(\tau) = (\tau-\alpha)^{m-1} (\tau-\bar\alpha)^{m-1}$, a polynomial of degree $d = 2m-2$. Then $C_{m-1}$ is $n$ by $n-2m+2$, and

$$ (C_{m-1}^* W C_{m-1})_{ij} = \begin{cases} \displaystyle\sum_{k=\max(i,j)}^{\min(i,j)+d} w_k\, t_{k-i}\, t_{k-j} & \text{if } |i-j| \le d , \\ 0 & \text{otherwise.} \end{cases} $$


11.  General  Condition  Numbers 


The  first  condition  numbers  we  considered  reflected  the  condition 
of  a zero  subject  to  infinitesimal  perturbations  that  maintain  the 
multiplicity  of  (only)  that  zero.  The  second  condition  numbers 
reflected  condition  with  respect  to  perturbations  that  maintain  the 
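multiplicity of that zero and its complex conjugate. We can go further, restricting the class of allowable perturbations to those that maintain whatever multiplicity structure we consider important in the other zeros.

For instance, let

$$ p(\tau) = \Big\{ \prod_{k=1}^{K} (\tau-\alpha_k)^{m_k} \Big\}\, q(\tau) , $$

where

$$ q(\alpha_k) \ne 0 , \qquad 1 \le k \le K , $$

and we consider only perturbations of the form

$$ (p+\delta p)(\tau) = \prod_{k=1}^{K} (\tau - (\alpha_k+\delta\alpha_k))^{m_k}\, (q+\delta q)(\tau) $$

so that, to first order,

$$ \delta p(\tau) = \Big\{ \prod_{k=1}^{K} (\tau-\alpha_k)^{m_k-1} \Big\}\, r(\tau) , \qquad r(\tau) = \Big\{ \prod_{k=1}^{K} (\tau-\alpha_k) \Big\}\, \delta q(\tau) - q(\tau) \sum_{k=1}^{K} m_k\, \delta\alpha_k \prod_{l \ne k} (\tau-\alpha_l) . $$

In the usual way define the condition number $\gamma$ of $\alpha_1$ with respect to such constrained perturbations to find that

$$ \gamma = \frac{1}{m_1\, |q(\alpha_1)| \prod_{k=2}^{K} |\alpha_1-\alpha_k|} \; \sup_{\deg r \,\le\, n-1-\sum_k (m_k-1)} \frac{|r(\alpha_1)|}{\Big\| \Big\{ \prod_{k=1}^{K} (\tau-\alpha_k)^{m_k-1} \Big\}\, r(\tau) \Big\|} . \tag{11.1} $$

In the $\ell_2$ case we can write the sup as

$$ \sigma_G^2 = \sup_{r} \frac{r^* e\, e^* r}{r^* G^* W G\, r} $$

where $G$ is the operator corresponding to multiplication by

$$ (\tau-\alpha_1)^{m_1-1} (\tau-\alpha_2)^{m_2-1} \cdots (\tau-\alpha_K)^{m_K-1} . $$

Then as before, in the case of complex perturbations of a complex polynomial,

$$ \sigma_G^2 = e^* (G^* W G)^{-1} e $$

where

$$ e^* = (\alpha_1^{n-1} \;\; \alpha_1^{n-2} \;\cdots\; 1) . $$

The case of real perturbations of a real polynomial with real $\alpha_1$ is similar. If $\alpha_1$ is a complex zero of a real polynomial, however, then one of the other $\alpha_k = \bar\alpha_1$, and

$$ \sigma_G^2 = \tfrac{1}{2} \left( x^* x + y^* y + \{ (x^* x - y^* y)^2 + 4 |x^* y|^2 \}^{1/2} \right) , $$

where $x^* x = u^* (G^* W G)^{-1} u$, $y^* y = v^* (G^* W G)^{-1} v$, etc., as in the previous section.

12. Application of the Idea of General Condition Number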
Let

$$ p(\tau) = (\tau-\alpha)^m (\tau-\bar\alpha)^m q(\tau) $$

be a real polynomial with complex $\alpha$. We have defined $\gamma_c$, the condition of $\alpha$ with respect to real changes which maintain conjugate m-tuple zeros $\alpha + \delta\alpha$ and $\bar\alpha + \delta\bar\alpha$. We want to compare $\gamma_c$ to $\gamma_2$, the condition of $\alpha$ with respect to complex changes that maintain m-tuple zeros $\alpha + \delta\alpha$ and $\bar\alpha + \delta\beta$. $\delta\alpha$ and $\delta\beta$ are no longer necessarily complex conjugate.

We have seen that

$$ \gamma_c = \frac{1}{2m\, |\mathrm{Im}\,\alpha|\, |q(\alpha)|} \cdot \frac{1}{\sqrt{2}} \sqrt{ x^* x + y^* y + \sqrt{ (x^* x - y^* y)^2 + 4 |x^* y|^2 } } $$

where $x^* x = u^* (C_{m-1}^* W C_{m-1})^{-1} u$, $u^* = \mathrm{Re}(e_\alpha^*)$, etc. $C_{m-1}$ corresponds to $(\tau-\alpha)^{m-1} (\tau-\bar\alpha)^{m-1}$.

To compute $\gamma_2$, let

$$ p(\tau) = (\tau-\alpha)^m (\tau-\bar\alpha)^m q(\tau) . $$

Then

$$ \gamma_2 = \frac{1}{m\, |q(\alpha)|\, |\alpha-\bar\alpha|} \sqrt{ e_\alpha^* (G^* W G)^{-1} e_\alpha } $$

where $G$ also corresponds to $(\tau-\alpha)^{m-1} (\tau-\bar\alpha)^{m-1}$. Since $G = C_{m-1}$,

$$ \gamma_2 = \frac{1}{2m\, |\mathrm{Im}\,\alpha|\, |q(\alpha)|} \, (x^* x + y^* y)^{1/2} $$

and

$$ 1 \le \frac{\gamma_2}{\gamma_c} \le \sqrt{2} . \tag{12.1} $$

In contrast to (10.4), our present result is independent of $m$. It means that the restriction to only real perturbations does not affect the condition number by a very large factor compared to a condition number that allows complex perturbations that maintain the multiplicities of the same number of complex zeros.


13. Condition Number vs. Distance to Submanifold

Now that we have a definition for condition number, we shall show why ill condition prompts us to look for the nearest polynomial with a more multiple zero. Consider the polynomial

$$ p(\tau) = (\tau-\alpha)^m q(\tau) . $$


Then the condition of $\alpha$ is

$$ \gamma = \frac{\sigma}{m\, |q(\alpha)|} . $$

Consider the second polynomial

$$ \tilde p(\tau) = (\tau-\alpha)^m (q(\tau) - q(\alpha)) . $$

This polynomial has an (m+1)-tuple zero $\alpha$. Further if

$$ \Delta = \|p - \tilde p\| = |q(\alpha)|\, \|(\tau-\alpha)^m\| , $$

then

$$ \Delta = \frac{\sigma\, \|(\tau-\alpha)^m\|}{m\, \gamma} . $$

That is, if $n$, $m$, $\alpha$, and the norm are regarded as fixed, then ill condition (large $\gamma$) always implies that there is a nearby polynomial with an (m+1)-tuple zero. Furthermore, the closest such polynomial may be much closer than the estimate above.

W. Kahan has suggested [17] that ill condition may be explained by exhibiting the nearest polynomial with a higher order zero. In the vector space of polynomials with m-tuple zeros, that corresponds to finding the closest point on the manifold of polynomials with (m+1)-tuple zeros. If that (m+1)-tuple zero is still ill conditioned, then there must be a nearby polynomial on the submanifold of polynomials with (m+2)-tuple zeros.

In  the  chapters  that  follow  we  shall  describe  ways  of  finding  the 
nearest  polynomial  with  an  m-tuple  zero. 


CHAPTER  III 


FINDING  THE  NEAREST  POLYNOMIAL  WITH  AN  m-TUPLE  ZERO 
1. Introduction

In the first chapter we discussed why we might wish to find the nearest polynomial with an m-tuple zero. Now we will demonstrate how to set up the equations to be solved. The problem amounts to a constrained optimization, and in general we find we must solve a non-analytic equation in a complex variable.

We  first  consider  the  simplest  cases  of  the  problem:  finding 
the  nearest  real  polynomial  with  an  n-tuple  zero  or  with  a double  zero. 

Then  we  discuss  the  equations  to  be  solved  for  the  stationary 
points  which  include  the  nearest  complex  polynomial  with  an  m-tuple 
zero. Finally we explain two kinds of second derivatives which may be used for deciding which stationary points are actually minima.

2. The Nearest Polynomial with an n-tuple Zero

We will start by considering the simplest case, that of finding the nearest polynomial with an n-tuple zero. We suppose that we have a monic polynomial

$$ p(\tau) = \tau^n + \sum_{j=1}^{n} p_j \tau^{n-j} $$

and we wish to find another polynomial

$$ q(\tau) = (\tau-\zeta)^n = \tau^n + \sum_{j=1}^{n} \binom{n}{j} (-\zeta)^j \tau^{n-j} $$

such that $\|p-q\|$ is a minimum. Since

$$ r(\tau) = p(\tau) - q(\tau) = \sum_{j=1}^{n} \Big( p_j - \binom{n}{j} (-\zeta)^j \Big) \tau^{n-j} $$

depends only on $\zeta$ we can easily find the equation to be solved for stationary points with respect to a given norm. We will demonstrate the equation for the weighted $\ell_2$ norms as follows:

If we let the raised dot represent $\partial/\partial(\mathrm{Re}\,\zeta)$ or $\partial/\partial(\mathrm{Im}\,\zeta)$ we find

$$ (\|r\|^2)^{\cdot} = \dot r^* W r + r^* W \dot r = 2\,\mathrm{Re}(r^* W \dot r) . $$

For stationarity we require then $\mathrm{Re}(r^* W \dot r) = 0$. Thus

$$ 0 = \mathrm{Re} \sum_{j=1}^{n} w_j\, r_j^*\, \dot r_j = \mathrm{Re} \sum_{j=1}^{n} w_j\, j \binom{n}{j} (-\zeta)^{j-1} \Big( p_j - \binom{n}{j} (-\zeta)^j \Big)^{*} \dot\zeta $$

or

$$ f(\zeta) \equiv \sum_{j=1}^{n} w_j\, j \binom{n}{j} (-\zeta^*)^{j-1} \Big( p_j - \binom{n}{j} (-\zeta)^j \Big) = 0 . \tag{2.1} $$

$f(\zeta)$ is thus our first example of a non-analytic function of a complex variable. To find a zero would in general require solving a system of two equations in two real variables.

In the most interesting case, however, we would be interested in real perturbations $q-p$ of a real polynomial $p$. If $\zeta$ were complex then $q-p$ could not be real, so we need only consider cases for which $\zeta$ is real. Then the real function $f(\zeta)$ is

$$ f(\zeta) = w_1 n (p_1 + n\zeta) + \sum_{j=2}^{n} w_j\, j \binom{n}{j} (-\zeta)^{j-2}\, \zeta \Big\{ \binom{n}{j} (-\zeta)^{j} - p_j \Big\} . \tag{2.2} $$

We write $f(\zeta)$ in this way for comparison with the expression for $f'(\zeta)$:

$$ f'(\zeta) = w_1 n^2 + \sum_{j=2}^{n} w_j\, j \binom{n}{j} (-\zeta)^{j-2} \Big\{ (2j-1) \binom{n}{j} \zeta^2 (-\zeta)^{j-2} - (j-1)\, p_j \Big\} . \tag{2.3} $$

Then we may use Newton's method from a suitable starting point to find a stationary point $\zeta$. $f(\zeta)$ is evidently a real polynomial of odd degree $2n-1$ so it does have at least one real zero. We shall see later that even when $n = 2$ there may be more than one real zero. We could in principle find all the zeros of $f$ with a conventional polynomial zero finding technique, but we would have to reject most of those zeros as irrelevant since they would be complex.

In practice it appears that when Newton's method is started from $\zeta = -p_1/n$, convergence occurs quickly to a stationary value which appears to be a reasonable candidate for a global minimum. This choice of starting point makes sense because, when we consider $p(\tau) = (\tau-\zeta_0)^n + \epsilon q(\tau)$ for infinitesimal perturbations $\epsilon q$, the solution turns out to be $\zeta = -\frac{1}{n} p_1$.

Even in the apparently simple case of finding the nearest n-tuple zero we encounter most of the characteristic difficulties of the more complicated cases of m-tuple zeros for $m \lt n$. In the next sections we will explore these cases in detail.
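
The Newton iteration for the real case can be sketched as follows. This is an illustration rather than the author's program: it assumes real coefficients `p = [p_1, ..., p_n]`, positive weights `w`, the starting point $-p_1/n$ mentioned above, and a simple step-size stopping rule; the function names are hypothetical.

```python
from math import comb


def f(zeta, p, w):
    """Stationarity function for real zeta: half the derivative of ||p - q||_W^2."""
    n = len(p)
    return sum(w[j - 1] * j * comb(n, j) * (-zeta) ** (j - 1)
               * (p[j - 1] - comb(n, j) * (-zeta) ** j)
               for j in range(1, n + 1))


def fprime(zeta, p, w):
    """Derivative of f with respect to real zeta."""
    n = len(p)
    total = w[0] * n * n
    for j in range(2, n + 1):
        c = comb(n, j)
        total += w[j - 1] * j * c * (-zeta) ** (j - 2) * (
            (2 * j - 1) * c * (-zeta) ** j - (j - 1) * p[j - 1])
    return total


def dist_sq(zeta, p, w):
    """Weighted squared distance from p to (tau - zeta)^n."""
    n = len(p)
    return sum(w[j - 1] * (p[j - 1] - comb(n, j) * (-zeta) ** j) ** 2
               for j in range(1, n + 1))


def nearest_n_tuple_zero(p, w, tol=1e-14, max_iter=50):
    """Newton's method on f, started from -p_1/n."""
    zeta = -p[0] / len(p)
    for _ in range(max_iter):
        step = f(zeta, p, w) / fprime(zeta, p, w)
        zeta -= step
        if abs(step) <= tol * max(1.0, abs(zeta)):
            break
    return zeta


if __name__ == "__main__":
    p = [-3.0, 3.01, -1.0]          # a small perturbation of (tau - 1)^3
    w = [1.0, 1.0, 1.0]
    z = nearest_n_tuple_zero(p, w)
    print("zeta =", z, " distance =", dist_sq(z, p, w) ** 0.5)
```

Newton's method finds only one stationary point; whether it is the global minimum still has to be checked separately, as the text cautions.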
3. The Nearest Polynomial with a Fixed Double Zero

In the present section we will solve the following problem: given a real polynomial

$$ p(\tau) = \tau^n + \sum_{j=1}^{n} p_j \tau^{n-j} , $$

what is the least real perturbation

$$ q(\tau) = \sum_{j=1}^{n} q_j \tau^{n-j} $$

such that $p + q$ has a specified real double zero $\zeta$? We will measure perturbations $q$ by the familiar norms $\|q\|_W^2 = q^T W q = \sum_{j=1}^{n} w_j q_j^2$.

Our problem is to minimize $\|q\|_W^2$ subject to the constraints that $p(\tau) + q(\tau) = (\tau-\zeta)^2 r(\tau)$ for some $r$ of degree $n-2$. Using the notations of the chapter on condition numbers, then, our problem is to find $r$ to minimize

$$ \|P_2 r - p\|_W = \|W^{1/2} P_2 r - W^{1/2} p\|_2 . $$

Recall that $P_2$ is the operator which multiplies polynomials of degree $n-2$ by $(\tau-\zeta)^2$.

The solution of this linear least squares problem is

$$ r = (W^{1/2} P_2)^{+} W^{1/2} p . $$

Then

$$ q = (P_2 (P_2^* W P_2)^{-1} P_2^* W - I)\, p . $$

Thus we can solve this problem by the usual least squares method. But when we do not specify $\zeta$ in advance that method is inapplicable since $P_2$ now depends on $\zeta$. Therefore we will look at a dual formulation of the problem that can easily be expanded when we allow $\zeta$ to vary.

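So now when we minimize $\|q\|_W^2$ subject to $(p+q)(\zeta) = 0$ and $(p+q)'(\zeta) = 0$ we will apply Lagrange multipliers according to the conventional formulation. Namely we will seek the stationary points of

$$ \phi = \sum_{j=1}^{n} w_j (q_j)^2 + \lambda_0 (p(\zeta) + q(\zeta)) + \lambda_1 (p'(\zeta) + q'(\zeta)) $$

with respect to changes in $q_j$. We note that $q(\zeta) = \sum_{j=1}^{n} q_j \zeta^{n-j}$ so

$$ \frac{\partial q(\zeta)}{\partial q_j} = \zeta^{n-j} \quad\text{and}\quad \frac{\partial q'(\zeta)}{\partial q_j} = (n-j)\, \zeta^{n-j-1} . $$

Thus

$$ 0 = 2 w_j q_j + \lambda_0 \zeta^{n-j} + \lambda_1 (n-j) \zeta^{n-j-1} $$

whence

$$ q_j = -\frac{1}{2 w_j} \{ \lambda_0 \zeta^{n-j} + \lambda_1 (n-j) \zeta^{n-j-1} \} , \quad j \ne n , \qquad\text{and}\qquad q_n = -\frac{\lambda_0}{2 w_n} . $$

To determine $\lambda_0$ and $\lambda_1$ we will use the constraints:

$$ 0 = (p+q)(\zeta) = p(\zeta) + \left(-\tfrac{1}{2}\right) \sum_{j=1}^{n} \frac{1}{w_j} \{ \lambda_0 (\zeta^2)^{n-j} + \lambda_1 (n-j)\, \zeta\, (\zeta^2)^{n-j-1} \} , $$

$$ 0 = (p+q)'(\zeta) = p'(\zeta) + \left(-\tfrac{1}{2}\right) \sum_{j=1}^{n-1} \frac{1}{w_j} \{ \lambda_0 (n-j)\, \zeta\, (\zeta^2)^{n-j-1} + \lambda_1 (n-j)^2 (\zeta^2)^{n-j-1} \} . $$

The above may be written as a linear system of equations:

$$ \begin{pmatrix} \sum_{j=1}^{n} \frac{1}{w_j} (\zeta^2)^{n-j} & \zeta \sum_{j=1}^{n-1} \frac{1}{w_j} (n-j) (\zeta^2)^{n-j-1} \\ \zeta \sum_{j=1}^{n-1} \frac{1}{w_j} (n-j) (\zeta^2)^{n-j-1} & \sum_{j=1}^{n-1} \frac{1}{w_j} (n-j)^2 (\zeta^2)^{n-j-1} \end{pmatrix} \begin{pmatrix} \lambda_0 \\ \lambda_1 \end{pmatrix} = 2 \begin{pmatrix} p(\zeta) \\ p'(\zeta) \end{pmatrix} . $$

If we write $\sigma_k = \sum_{j=1}^{n} \frac{1}{w_j} (n-j)^k (\zeta^2)^{n-j}$ then

$$ \begin{pmatrix} \lambda_0 \\ \lambda_1 \end{pmatrix} = \frac{2}{\sigma_0 \sigma_2 - \sigma_1^2} \begin{pmatrix} \sigma_2 & -\zeta \sigma_1 \\ -\zeta \sigma_1 & \zeta^2 \sigma_0 \end{pmatrix} \begin{pmatrix} p(\zeta) \\ p'(\zeta) \end{pmatrix} , $$

$$ q_j = -\frac{1}{w_j} \, \frac{\zeta^{n-j}}{\sigma_0 \sigma_2 - \sigma_1^2} \{ (\sigma_2 - (n-j) \sigma_1)\, p(\zeta) + \zeta (-\sigma_1 + (n-j) \sigma_0)\, p'(\zeta) \} . $$

Then

$$ q(\tau) = -\sum_{j=1}^{n} \frac{1}{w_j} \, \frac{(\sigma_2 - (n-j) \sigma_1)\, p(\zeta) + \zeta (-\sigma_1 + (n-j) \sigma_0)\, p'(\zeta)}{\sigma_0 \sigma_2 - \sigma_1^2} \, \zeta^{n-j} \tau^{n-j} \tag{3.1} $$

is the smallest perturbation moving $p(\tau)$ to the manifold of polynomials having double zeros at $\zeta$. The distance may be calculated to be

$$ \|q\|_W = \left[ \frac{\sigma_2 (p(\zeta))^2 - 2 \sigma_1\, p(\zeta) (\zeta p'(\zeta)) + \sigma_0 (\zeta p'(\zeta))^2}{\sigma_0 \sigma_2 - \sigma_1^2} \right]^{1/2} . $$

The foregoing calculation is invalid when $\zeta = 0$. In that case $q_n = -p_n$, $q_{n-1} = -p_{n-1}$ and $q_j = 0$, $1 \le j \le n-2$, so that $\|q\|_W = \{ w_{n-1} p_{n-1}^2 + w_n p_n^2 \}^{1/2}$.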
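
The closed form can be exercised on a small real example. The sketch below is illustrative: it assumes a monic real polynomial given by `p = [p_1, ..., p_n]`, positive weights `w`, and a nonzero real `zeta`; the function name `fixed_double_zero` is hypothetical.

```python
def fixed_double_zero(p, w, zeta):
    """Least weighted perturbation q making zeta (real, nonzero) a double zero of p + q.

    Returns (q, dist_sq) where q = [q_1, ..., q_n] and dist_sq = sum_j w_j q_j^2.
    """
    n = len(p)
    pz = zeta ** n + sum(p[j - 1] * zeta ** (n - j) for j in range(1, n + 1))
    dpz = n * zeta ** (n - 1) + sum((n - j) * p[j - 1] * zeta ** (n - j - 1)
                                    for j in range(1, n))
    # sigma_k = sum_j (n-j)^k (zeta^2)^(n-j) / w_j, for k = 0, 1, 2
    sig = [sum((n - j) ** k * zeta ** (2 * (n - j)) / w[j - 1]
               for j in range(1, n + 1)) for k in range(3)]
    det = sig[0] * sig[2] - sig[1] ** 2
    q = [-(zeta ** (n - j) / (w[j - 1] * det))
         * ((sig[2] - (n - j) * sig[1]) * pz
            + zeta * (-sig[1] + (n - j) * sig[0]) * dpz)
         for j in range(1, n + 1)]
    dist_sq = (sig[2] * pz ** 2 - 2 * sig[1] * pz * (zeta * dpz)
               + sig[0] * (zeta * dpz) ** 2) / det
    return q, dist_sq


if __name__ == "__main__":
    p = [-2.0, 0.5, 1.0, -0.3]
    w = [1.0, 2.0, 1.0, 0.5]
    q, d2 = fixed_double_zero(p, w, 0.8)
    print("q =", q)
    print("distance =", d2 ** 0.5)
```

Evaluating $p + q$ and its derivative at `zeta` should give values at rounding level, and the weighted sum of squares of `q` should match the distance formula.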
4. The Nearest Polynomial with a Double Zero

After the complicated expressions of the previous section, one would expect worse from the following problem: given real $p$, find real $q$ such that $p + q$ has a real double zero $\zeta$, not fixed in advance, so that $\zeta$ may vary. The final expressions to be derived are surprisingly simple, however.

We could solve this problem by differentiating with respect to $\zeta$ the final expression for $\|q\|_W^2$ of the previous section. It will be more enlightening, however, to make a fresh start. The direct linear least squares solution method won't work now, and we must solve the problem with Lagrange multipliers. Thus we seek the stationary points of

$$ \nu = \sum_{j=1}^{n} w_j (q_j)^2 + \lambda_0 (p+q)(\zeta) + \lambda_1 (p+q)'(\zeta) $$

with respect to variations in $q_j$ and $\zeta$. Then as before

$$ 0 = \frac{\partial\nu}{\partial q_j} = 2 w_j q_j + \lambda_0 \zeta^{n-j} + \lambda_1 (n-j) \zeta^{n-j-1} $$

but now, in addition,

$$ 0 = \frac{\partial\nu}{\partial\zeta} = \lambda_0 (p+q)'(\zeta) + \lambda_1 (p+q)''(\zeta) . $$

We exploit the constraint $(p+q)'(\zeta) = 0$ to see that

$$ 0 = \lambda_1 (p+q)''(\zeta) . $$

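Remarkably enough, either one of the Lagrange multipliers is identically zero or else the unknown $\zeta$ is not only a double but a triple zero of $p + q$. It turns out that stationary points with $(p+q)''(\zeta) = 0$ and $\lambda_1 \ne 0$ are almost never minima; see section 9. Accepting that assertion for the time being, assume $\lambda_1 = 0$. Then

$$ q_j = -\frac{\lambda_0}{2 w_j} \zeta^{n-j} . $$

From the constraint $(p+q)(\zeta) = 0$, we find

$$ p(\zeta) = \frac{\lambda_0}{2} \sum_{j=1}^{n} \frac{1}{w_j} (\zeta^2)^{n-j} = \frac{\lambda_0}{2} \sigma_0 $$

so

$$ \lambda_0 = \frac{2 p(\zeta)}{\sigma_0} $$

and

$$ q_j = -\frac{p(\zeta)}{\sigma_0} \, \frac{\zeta^{n-j}}{w_j} , \qquad q(\tau) = -\frac{p(\zeta)}{\sigma_0} \sum_{j=1}^{n} \frac{1}{w_j} \zeta^{n-j} \tau^{n-j} . $$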


We still don't know $\zeta$, but we can exploit the constraint $(p+q)'(\zeta) = 0$ to find

$$ -p'(\zeta) = \sum_{j=1}^{n} q_j (n-j) \zeta^{n-j-1} = -\frac{p(\zeta)}{\sigma_0} \sum_{j=1}^{n} \frac{(n-j)}{w_j} \zeta^{n-j} \zeta^{n-j-1} $$

and

$$ \frac{\zeta p'(\zeta)}{p(\zeta)} = \frac{\sigma_1}{\sigma_0} = \frac{\sum_{j=1}^{n} \frac{(n-j)}{w_j} (\zeta^2)^{n-j}}{\sum_{j=1}^{n} \frac{1}{w_j} (\zeta^2)^{n-j}} \tag{4.1} $$

is the equation to be solved for $\zeta$. Apparently it could be written as a polynomial equation of degree $3n-2$. We will devote several sections to discussions of ways to solve this equation. Let it suffice to say that when $p$ is real, the equation always has a solution $\zeta = 0$, and when $n > 2$ is even and $p_{n-1} \ne 0$ it always has at least one other real solution as well.

Once a solution $\zeta$ has been found, the corresponding distance is

$$ \|q\|_W = \frac{|p(\zeta)|}{\sqrt{\sigma_0}} = \frac{\sqrt{\sigma_0}\, |\zeta p'(\zeta)|}{\sigma_1} . $$

There are usually several real solutions $\zeta$ and, surprisingly, most of them are local minima, rather than maxima or saddle points. It turns out that the maxima are usually the stationary points with $(p+q)''(\zeta) = 0$. A difficult, unsolved problem is to find the $\zeta$ corresponding to a global minimum of $\|q\|$ without having to find all the solutions.
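
For a real polynomial, one real solution of the stationarity equation can be located by bracketing. The sketch below is only an illustration: it rewrites the equation as $g(\zeta) = \sigma_0 \zeta p'(\zeta) - \sigma_1 p(\zeta) = 0$, uses plain bisection in place of whatever solver the later sections develop, and assumes the caller supplies a bracket on which $g$ changes sign. All function names are hypothetical.

```python
def p_val(p, z):
    """Value of the monic polynomial tau^n + sum_j p_j tau^(n-j) at z."""
    n = len(p)
    return z ** n + sum(p[j - 1] * z ** (n - j) for j in range(1, n + 1))


def dp_val(p, z):
    """Value of its first derivative at z."""
    n = len(p)
    return n * z ** (n - 1) + sum((n - j) * p[j - 1] * z ** (n - j - 1)
                                  for j in range(1, n))


def sigma(k, z, w):
    """sigma_k = sum_j (n-j)^k (z^2)^(n-j) / w_j."""
    n = len(w)
    return sum((n - j) ** k * (z * z) ** (n - j) / w[j - 1]
               for j in range(1, n + 1))


def g(z, p, w):
    """Stationarity equation cleared of denominators."""
    return sigma(0, z, w) * z * dp_val(p, z) - sigma(1, z, w) * p_val(p, z)


def bisect(func, a, b, iters=200):
    """Bisection on [a, b]; func(a) and func(b) must differ in sign."""
    fa = func(a)
    for _ in range(iters):
        mid = 0.5 * (a + b)
        fm = func(mid)
        if (fa < 0) == (fm < 0):
            a, fa = mid, fm
        else:
            b = mid
    return 0.5 * (a + b)


def nearest_double_zero(p, w, a, b):
    """Return (zeta, q, distance) for the stationary point bracketed by [a, b]."""
    n = len(p)
    z = bisect(lambda t: g(t, p, w), a, b)
    s0 = sigma(0, z, w)
    q = [-(p_val(p, z) / s0) * z ** (n - j) / w[j - 1] for j in range(1, n + 1)]
    return z, q, abs(p_val(p, z)) / s0 ** 0.5


if __name__ == "__main__":
    p = [-2.2, 1.2]                 # (tau - 1)(tau - 1.2)
    z, q, dist = nearest_double_zero(p, [1.0, 1.0], 1.0, 1.2)
    print("zeta =", z, " q =", q, " distance =", dist)
```

For this quadratic with unit weights, $g(\zeta) = \zeta(\zeta^3 + 0.8\zeta - 2.2)$, so the nonzero real stationary point lies between the two original zeros, as one would expect.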
5. The Nearest Polynomial with a Fixed m-tuple Zero

Using the notation of Chapter I we will now show how to find the nearest polynomial with an m-tuple zero $\zeta$. We wish to minimize $\|q\|_W^2 = q^* W q$ subject to $Ap + Aq = 0$.

We may find the linear least squares solution directly. The vector $W^{1/2} q$ of least Euclidean norm solving $(A W^{-1/2})(W^{1/2} q) = -Ap$ is just $(W^{1/2} q) = (A W^{-1/2})^{+} (-Ap)$, where $+$ denotes pseudo inverse. Since $A$ has more columns than rows, and the rows are linearly independent,

$$ (A W^{-1/2})^{+} = W^{-1/2} A^* (A W^{-1} A^*)^{-1} , $$

whence

$$ q = -W^{-1} A^* (A W^{-1} A^*)^{-1} A p . \tag{5.1} $$

Consequently

$$ \|q\|_W = ((Ap)^* (A W^{-1} A^*)^{-1} A p)^{1/2} . $$
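
The minimum-norm solution can be formed directly for a complex zero of any multiplicity m. The sketch below is an illustration under stated assumptions: W is diagonal with entries `w`, the rows of A are the derivative-evaluation functionals $e^* D^k$ for $k = 0, \ldots, m-1$, the vector $Ap$ is computed as the derivatives of the monic $p$ at $\zeta$ (leading term included), and the small Gaussian-elimination solver and all function names are hypothetical.

```python
def solve(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(M, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            factor = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= factor * aug[c][k]
    z = [0j] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z


def falling(a, k):
    """a (a-1) ... (a-k+1), the factor produced by differentiating tau^a k times."""
    out = 1
    for i in range(k):
        out *= a - i
    return out


def deriv_row(zeta, n, k):
    """Row e* D^k: maps (q_1..q_n) to the k-th derivative of q at zeta."""
    return [falling(n - j, k) * zeta ** (n - j - k) if n - j >= k else 0j
            for j in range(1, n + 1)]


def monic_deriv(p, zeta, k):
    """k-th derivative at zeta of tau^n + sum_j p_j tau^(n-j) (requires k <= n)."""
    n = len(p)
    lead = falling(n, k) * zeta ** (n - k)
    return lead + sum(c * pj for c, pj in zip(deriv_row(zeta, n, k), p))


def nearest_with_fixed_zero(p, w, zeta, m):
    """q = -W^{-1} A* (A W^{-1} A*)^{-1} A p, and its weighted norm."""
    n = len(p)
    A = [deriv_row(zeta, n, k) for k in range(m)]
    c = [monic_deriv(p, zeta, k) for k in range(m)]
    G = [[sum(A[i][j] * A[k][j].conjugate() / w[j] for j in range(n))
          for k in range(m)] for i in range(m)]
    y = solve(G, c)
    q = [-sum(A[i][j].conjugate() * y[i] for i in range(m)) / w[j]
         for j in range(n)]
    dist = abs(sum(ci.conjugate() * yi for ci, yi in zip(c, y))) ** 0.5
    return q, dist


if __name__ == "__main__":
    p = [0.5, -1.0 + 0.2j, 0.3, 2.0, -0.7]
    q, dist = nearest_with_fixed_zero(p, [1.0, 2.0, 1.0, 0.5, 3.0], 0.6 - 0.3j, 3)
    print("distance =", dist)
```

Only an m by m system is solved, so the cost is dominated by forming $A W^{-1} A^*$, which takes on the order of $m^2 n$ operations.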


To compare this with our earlier results for real double zeros, we let $m = 2$ and recall that when $m = 2$,

$$ A = \begin{pmatrix} e^* \\ e^* D \end{pmatrix} $$

so

$$ A W^{-1} A^* = \begin{pmatrix} e^* W^{-1} e & e^* W^{-1} D^* e \\ e^* D W^{-1} e & e^* D W^{-1} D^* e \end{pmatrix} . $$

We can derive expressions for the matrix elements in terms of the

$$ \sigma_k = \sum_{j=1}^{n} \frac{(n-j)^k}{w_j} |\zeta^2|^{n-j} . $$

Notice that this is a redefinition of the $\sigma_k$ replacing the previous definition $\sum_{j=1}^{n} \frac{(n-j)^k}{w_j} (\zeta^2)^{n-j}$ which is not suitable for complex $\zeta$. Then

$$ e^* W^{-1} e = \sigma_0 , $$

$$ e^* D W^{-1} e = \frac{\sigma_1}{\zeta} = (e^* W^{-1} D^* e)^* , $$

$$ e^* D W^{-1} D^* e = \frac{\sigma_2}{|\zeta^2|} . $$

Therefore

$$ (A W^{-1} A^*)^{-1} = \frac{1}{\sigma_0 \sigma_2 - \sigma_1^2} \begin{pmatrix} \sigma_2 & -\zeta \sigma_1 \\ -\zeta^* \sigma_1 & |\zeta^2| \sigma_0 \end{pmatrix} $$

and

$$ \|q\|_W = \left[ \frac{\sigma_2 |p(\zeta)|^2 - 2 \sigma_1 \mathrm{Re}(p^*(\zeta)\, \zeta p'(\zeta)) + \sigma_0 |\zeta p'(\zeta)|^2}{\sigma_0 \sigma_2 - \sigma_1^2} \right]^{1/2} . $$

Apparently the major difference between the previous real case and the present complex case is that expressions like $(\zeta)^2$ have been replaced by expressions like $|\zeta^2|$. The effect of this change will be that the equations to be solved for $\zeta$, when it is not fixed in advance, will no longer be analytic.

6. The Nearest Polynomial with an m-tuple Zero, No Longer Fixed

Our problem appears similar to that in a previous section: minimize $\|q\|_W$ subject to $Ap + Aq = 0$. The difference is that the $\zeta$ on which $A$ depends is no longer fixed, and a linear least squares theory is no longer applicable. As we have just seen, if we do hold $\zeta$ fixed, we can write $q$ as a non-analytic function of $\zeta$. Therefore we can find a directional derivative of $q$ if we think of $\zeta$ as a function of a real parameter $\theta$: $\zeta = \zeta_0 + \theta \dot\zeta$. Then $d\zeta/d\theta = \dot\zeta$ and if

$$ \nu = q^* W q $$

then

$$ \frac{d\nu}{d\theta} = \dot\nu = \dot q^* W q + q^* W \dot q = 2\,\mathrm{Re}\,(q^* W \dot q) $$

since $W$ is constant. At a stationary point of $\nu$ we would require $\dot\nu = 0$ for all $\dot q$, including that particular one which makes $q^* W \dot q$ real. From that case we conclude that

$$ 0 = q^* W \dot q $$

is the condition for stationarity.

But $\dot q$ is constrained in the values it may take. When we differentiate that constraint we find $\dot A p + \dot A q + A \dot q = 0$. Since

$$ (e^*)^{\cdot} = (\cdots (\zeta^{n-j})^{\cdot} \cdots) = (\cdots (n-j) \zeta^{n-j-1} \dot\zeta \cdots) = e^* D \dot\zeta , $$

we conclude that $\dot A = A D \dot\zeta$. Therefore the constraint on $\dot q$ and $\dot\zeta$ is $(ADp + ADq) \dot\zeta + A \dot q = 0$.

The idea of constrained optimization is that every pair $(\dot q, \dot\zeta)$ which satisfies the constraint should also satisfy the stationarity property, i.e., in the notation of the Lagrange multiplier theorem (Appendix 6),

$$ Bx = 0 \;\Rightarrow\; y^* x = 0 $$

where

$$ x = \begin{pmatrix} \dot q \\ \dot\zeta \end{pmatrix} , \qquad B = (A \mid ADp + ADq) $$

and

$$ y^* = (q^* W \mid 0) . $$

The Lagrange multiplier theorem just cited assures us that $y$ may be written $y = B^* \ell$ for some vector $\ell$ of Lagrange multipliers. For convenience we will write

$$ \ell = (\ell_0, \ell_1, \ldots, \ell_{m-1})^T . $$

Then

$$ \begin{pmatrix} W q \\ 0 \end{pmatrix} = \begin{pmatrix} A^* \\ (ADp + ADq)^* \end{pmatrix} \ell . \tag{6.1} $$

But since $Ap + Aq = 0$ is the constraint, $(ADp + ADq)^* \ell = 0$ reduces to $((p+q)^{(m)}(\zeta))^* \ell_{m-1} = 0$ and we are therefore faced with the two possibilities we saw in the $m = 2$ case: either the last Lagrange multiplier is zero, or the zero $\zeta$ has one higher multiplicity than we had planned. By examining the second derivative $\ddot\nu$ in a subsequent section we will find that stationary points with extra multiplicity corresponding to minima of $\nu$ always have $\ell_{m-1} = 0$. Therefore we may always assume that $\ell_{m-1} = 0$ at interesting stationary points.

Continuing we find $Wq = A^* \ell$ so $q = W^{-1} A^* \ell$. Then the constraint implies $(A W^{-1} A^*) \ell = -Ap$. Although $A W^{-1} A^*$ is Hermitian positive definite and therefore invertible, we would find that $\ell_{m-1}$ would not come out to be zero except for certain special $\zeta$'s. These special values of $\zeta$ must correspond to the stationary points of $\nu$. To find out what they are, we write $\ell = \begin{pmatrix} \hat\ell \\ 0 \end{pmatrix}$ and

$$ Ap + (A W^{-1} A^*) \begin{pmatrix} \hat\ell \\ 0 \end{pmatrix} = 0 $$

or

$$ (Ap \mid A W^{-1} A^* Z) \begin{pmatrix} 1 \\ \hat\ell \end{pmatrix} = 0 . $$

Here $Z$ is the $m$ by $m-1$ matrix

$$ Z = \begin{pmatrix} I_{m-1} \\ 0 \end{pmatrix} $$

and it has the effect of removing the last column of $A W^{-1} A^*$. The resulting homogeneous equation above obviously has a nontrivial solution so the matrix is singular. Therefore

$$ 0 = \det (Ap \mid A W^{-1} A^* Z) \tag{6.2} $$

is the equation to be solved to find the $\zeta$'s corresponding to interesting stationary points of $\nu$.

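To see what kind of equation it is, consider the case $m = 2$:

$$ A W^{-1} A^* = \begin{pmatrix} e^* W^{-1} e & e^* W^{-1} D^* e \\ e^* D W^{-1} e & e^* D W^{-1} D^* e \end{pmatrix} $$

so

$$ 0 = \det \begin{pmatrix} p(\zeta) & e^* W^{-1} e \\ p'(\zeta) & e^* D W^{-1} e \end{pmatrix} = \frac{\sigma_1}{\zeta}\, p(\zeta) - \sigma_0\, p'(\zeta) $$

which we may write

$$ \frac{\zeta p'(\zeta)}{p(\zeta)} = \frac{\sum_{j=1}^{n} \frac{(n-j)}{w_j} |\zeta^2|^{n-j}}{\sum_{j=1}^{n} \frac{1}{w_j} |\zeta^2|^{n-j}} . \tag{6.3} $$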
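This equation is evidently not that of an analytic function. We shall return to it later. Supposing for now that we have found an acceptable solution $\zeta$ for the equation above, we can then evaluate $\hat\ell$ from

$$ A W^{-1} A^* Z \hat\ell = -Ap $$

in any of a variety of ways; the obvious way is to solve

$$ (Z^* A W^{-1} A^* Z) \hat\ell = -Z^* A p . $$

This equation is the same as

$$ (\hat A W^{-1} \hat A^*) \hat\ell = -\hat A p $$

where $\hat A$ is one dimension smaller than $A$, i.e., $\hat A = Z^* A$. Then $q = W^{-1} \hat A^* \hat\ell$ and finally

$$ \|q\|_W = (\hat\ell^* \hat A W^{-1} \hat A^* \hat\ell)^{1/2} = ((\hat A p)^* (\hat A W^{-1} \hat A^*)^{-1} (\hat A p))^{1/2} . $$

For the case $m = 2$ that we considered previously,

$$ \hat A W^{-1} \hat A^* = \sigma_0 , $$

$$ \hat\ell = -p(\zeta)/\sigma_0 , $$

$$ q = (-p(\zeta)/\sigma_0)\, W^{-1} e , \tag{6.4} $$

and

$$ \|q\|_W = |p(\zeta)| / \sqrt{\sigma_0} . $$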


7. Computational Details: The Equation to Solve for the Nearest m-tuple Zero

As we have seen, in order to find the nearest polynomial with a double zero, we must solve the equation

$$ h(\tau) = \sigma_1 p(\tau) - \sigma_0 \tau p'(\tau) = 0 $$

where

$$ \sigma_k = \sum_{j=1}^{n} \frac{(n-j)^k}{w_j} |\tau^2|^{n-j} . $$

We will see that there are various ways of solving this equation for its zeros $\zeta$ when $\tau$ and $p$ are real, but for the more general complex case there do not seem to be many methods that work. We will usually solve this equation by means of Newton's method applied to two real equations in two real unknowns. In this section we will provide the expressions necessary for Newton's method in the case of an m-tuple zero.

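The equation we have to solve is in this form:

\[ 0 = \det(Ap \;\vdots\; AW^{-1}A^*Z) \]

or, written out,

\[ 0 = \det\begin{pmatrix}
p(\zeta) & e^*W^{-1}e & \cdots & e^*W^{-1}(D^{m-2})^*e \\
p'(\zeta) & \vdots & & \vdots \\
\vdots & & & \\
p^{(m-1)}(\zeta) & e^*D^{m-1}W^{-1}e & \cdots & e^*D^{m-1}W^{-1}(D^{m-2})^*e
\end{pmatrix} . \]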

\ 


» 


No*  e*oW>e  » ^tj] 

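By multiplying rows and columns by powers of $\zeta$ and $\zeta^*$ we can
rewrite the determinant without changing its value as

\[ f(\zeta) = \det\begin{pmatrix}
p(\zeta) & G_{00} & \cdots & G_{0,m-2} \\
\zeta p'(\zeta) & G_{10} & \cdots & G_{1,m-2} \\
\vdots & \vdots & & \vdots \\
\zeta^{m-1}p^{(m-1)}(\zeta) & G_{m-1,0} & \cdots & G_{m-1,m-2}
\end{pmatrix} . \]

In this form it is obvious that the expansion in terms of minors from
the first column will yield

\[ f(\zeta) = \Delta_0\,p(\zeta) - \Delta_1\,(\zeta p'(\zeta)) + \cdots + (-1)^{m-1}\Delta_{m-1}\,(\zeta^{m-1}p^{(m-1)}(\zeta)) \]

\[ = (\Delta_0,\; -\Delta_1,\; \cdots)
\begin{pmatrix} p(\zeta) \\ \zeta p'(\zeta) \\ \vdots \\ \zeta^{m-1}p^{(m-1)}(\zeta) \end{pmatrix}
= v^*u . \]

Thus f may be expressed as a scalar product of (1) a vector u of
analytic functions of $\zeta$ and (2) a vector v of functions depending
only on the $\sigma_j$ and hence only on $|\zeta^2|$. In fact the $\Delta_i$ are real,
analytic functions of the real variable $|\zeta^2|$.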

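The two real equations which we shall solve by Newton's method are
Re f = 0 and Im f = 0, that is,

\[ v^*\,\mathrm{Re}\,u = 0 , \qquad v^*\,\mathrm{Im}\,u = 0 . \tag{7.1} \]

Now

\[ \frac{\partial\,\mathrm{Re}\,f}{\partial\,\mathrm{Re}\,\zeta}
= \Big(\frac{\partial v}{\partial\,\mathrm{Re}\,\zeta}\Big)^{*}\mathrm{Re}\,u + v^*\Big(\frac{\partial\,\mathrm{Re}\,u}{\partial\,\mathrm{Re}\,\zeta}\Big)
= \Big(v'\,\frac{\partial|\zeta^2|}{\partial\,\mathrm{Re}\,\zeta}\Big)^{*}\mathrm{Re}\,u + v^*\,\mathrm{Re}\,u' , \]

\[ \frac{\partial\,\mathrm{Re}\,f}{\partial\,\mathrm{Re}\,\zeta} = 2\,\mathrm{Re}\,\zeta\;\mathrm{Re}((v')^*u) + \mathrm{Re}(v^*u') . \tag{7.2a} \]

Similarly

\[ \frac{\partial\,\mathrm{Re}\,f}{\partial\,\mathrm{Im}\,\zeta} = 2\,\mathrm{Im}\,\zeta\;\mathrm{Re}((v')^*u) - \mathrm{Im}(v^*u') , \]

\[ \frac{\partial\,\mathrm{Im}\,f}{\partial\,\mathrm{Re}\,\zeta} = 2\,\mathrm{Re}\,\zeta\;\mathrm{Im}((v')^*u) + \mathrm{Im}(v^*u') , \tag{7.2b} \]

\[ \frac{\partial\,\mathrm{Im}\,f}{\partial\,\mathrm{Im}\,\zeta} = 2\,\mathrm{Im}\,\zeta\;\mathrm{Im}((v')^*u) + \mathrm{Re}(v^*u') . \]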


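In general $v^*$ is a vector whose components are functions of the $G_{ik}$,
which can in turn be written as functions of the $\sigma_k$ defined earlier.
Then, with the prime on $\sigma_k$ denoting differentiation with respect to $|\zeta^2|$,

\[ \sigma_k' = \frac{1}{|\zeta^2|}\,\sigma_{k+1} . \]

For the case m = 2 we have $v^* = (\sigma_1,\; -\sigma_0)$ and

\[ u = \begin{pmatrix} p(\zeta) \\ \zeta p'(\zeta) \end{pmatrix} . \]

Then

\[ (v')^*u = \frac{1}{|\zeta^2|}\,\{\sigma_2\,p(\zeta) - \sigma_1\,\zeta p'(\zeta)\} \]

and

\[ v^*u' = \sigma_1\,p'(\zeta) - \sigma_0\,(\zeta p''(\zeta) + p'(\zeta)) \]

are the quantities required in the expressions for the partial derivatives.
Those partial derivatives enable us to compute the Jacobian
matrix required for Newton's method in two dimensions.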
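The m = 2 iteration can be sketched as follows. This is a minimal illustration, not the original program: it assumes p is monic with coefficients listed from the leading term down, uses $\sigma_k = \sum_j (n-j)^k|\zeta^2|^{n-j}/w_j$, forms the Jacobian from the four partial derivatives (7.2a) and (7.2b), and solves the 2 by 2 system by Cramer's rule. All function names are hypothetical.

```python
def poly_eval(coeffs, z):
    """Horner evaluation; coeffs run from the leading coefficient down."""
    acc = 0j
    for c in coeffs:
        acc = acc * z + c
    return acc

def poly_deriv(coeffs):
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def sigma(k, zeta, w):
    n = len(w)
    r2 = abs(zeta) ** 2
    return sum((n - j) ** k * r2 ** (n - j) / w[j - 1] for j in range(1, n + 1))

def f_value(p, w, zeta):
    """f = v*u = sigma_1 p(zeta) - sigma_0 zeta p'(zeta)."""
    return (sigma(1, zeta, w) * poly_eval(p, zeta)
            - sigma(0, zeta, w) * zeta * poly_eval(poly_deriv(p), zeta))

def newton_double_zero(p, w, zeta0, steps=50, tol=1e-14):
    """Newton's method on Re f = 0, Im f = 0 in the unknowns Re zeta, Im zeta."""
    dp = poly_deriv(p)
    ddp = poly_deriv(dp)
    zeta = complex(zeta0)
    for _ in range(steps):
        s0, s1, s2 = (sigma(k, zeta, w) for k in range(3))
        pz, dpz, ddpz = (poly_eval(c, zeta) for c in (p, dp, ddp))
        f = s1 * pz - s0 * zeta * dpz
        if abs(f) < tol:
            break
        vpu = (s2 * pz - s1 * zeta * dpz) / abs(zeta) ** 2   # (v')* u
        vup = s1 * dpz - s0 * (zeta * ddpz + dpz)            # v* u'
        x, y = zeta.real, zeta.imag
        j11 = 2 * x * vpu.real + vup.real   # d Re f / d Re zeta
        j12 = 2 * y * vpu.real - vup.imag   # d Re f / d Im zeta
        j21 = 2 * x * vpu.imag + vup.imag   # d Im f / d Re zeta
        j22 = 2 * y * vpu.imag + vup.real   # d Im f / d Im zeta
        det = j11 * j22 - j12 * j21
        dx = (-f.real * j22 + f.imag * j12) / det
        dy = (j21 * f.real - j11 * f.imag) / det
        zeta += complex(dx, dy)
    return zeta
```

The starting guess matters: the equation generally has several solutions, and the iteration breaks down at $\zeta = 0$ or wherever the Jacobian is singular, so a production code would need safeguards that are omitted here.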

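The case for m = 3 is more complicated. In accordance with the
previous formulation,

\[ \Delta_0 = \det\begin{pmatrix} G_{10} & G_{11} \\ G_{20} & G_{21} \end{pmatrix}
= \sigma_1(\sigma_3 - \sigma_2) - (\sigma_2 - \sigma_1)\sigma_2 = \sigma_1\sigma_3 - \sigma_2^2 , \]

\[ \Delta_1 = \sigma_0\sigma_3 - \sigma_0\sigma_2 - \sigma_1\sigma_2 + \sigma_1^2 , \]

\[ \Delta_2 = \sigma_0\sigma_2 - \sigma_1^2 . \]

For simplicity we will make a slight change:

\[ v^*u = \Big[\, v^*\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix} \Big]
\Big[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} u \,\Big] ,
\qquad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} u =
\begin{pmatrix} p(\zeta) \\ \zeta p'(\zeta) \\ \zeta p'(\zeta) + \zeta^2 p''(\zeta) \end{pmatrix} . \tag{7.3} \]

With $v^*$ and $u$ thus redefined, so that $v^* = (\Delta_0,\; -(\sigma_0\sigma_3 - \sigma_1\sigma_2),\; \Delta_2)$,

\[ (v')^* = \frac{1}{|\zeta^2|}\,\big(\sigma_1\sigma_4 - \sigma_2\sigma_3,\;\; -(\sigma_0\sigma_4 - \sigma_2^2),\;\; \sigma_0\sigma_3 - \sigma_1\sigma_2\big) , \]

\[ u' = \begin{pmatrix} p' \\ \zeta p'' + p' \\ \zeta^2 p''' + 3\zeta p'' + p' \end{pmatrix} . \tag{7.4} \]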

It may be observed that expressions like $\sigma_1\sigma_3 - \sigma_2^2$ involving subtraction
of positive quantities will result in cancellation. Therefore we
will rewrite those expressions. Let a typical term be

\[ \Delta = \sigma_a\sigma_b - \sigma_c\sigma_d \]

with a + b = c + d. Then

\[ \Delta = \Big(\sum_j \frac{(n-j)^a}{w_j}|\zeta^2|^{n-j}\Big)\Big(\sum_k \frac{(n-k)^b}{w_k}|\zeta^2|^{n-k}\Big)
- \Big(\sum_j \frac{(n-j)^c}{w_j}|\zeta^2|^{n-j}\Big)\Big(\sum_k \frac{(n-k)^d}{w_k}|\zeta^2|^{n-k}\Big) \]

\[ = \sum_{j=1}^{n}\sum_{k=1}^{n} \frac{1}{w_j}|\zeta^2|^{n-j}\,\frac{1}{w_k}|\zeta^2|^{n-k}\,\{(n-j)^a(n-k)^b - (n-j)^c(n-k)^d\} . \]

This double sum has an entry for each position in an n by n square
array, except for the diagonal entries which vanish. Therefore, we
may add the j,k and k,j terms together and count only the terms
with k > j:

\[ \Delta = |\zeta^2| \sum_{j=1}^{n-1} \frac{1}{w_j}|\zeta^2|^{n-j-1}\Big[\sum_{k=j+1}^{n} \frac{1}{w_k}|\zeta^2|^{n-k}\{\cdot\}\Big] , \]

\[ \{\cdot\} = (n-j)^a(n-k)^b + (n-j)^b(n-k)^a - (n-j)^c(n-k)^d - (n-j)^d(n-k)^c . \]

If we consider $\Delta$ to be a function of a real variable $|\zeta^2|$ then we
may define $\Delta'$ as

\[ \Delta' = \frac{\partial\Delta}{\partial|\zeta^2|} . \]

Then

\[ \Delta' = \sum_{j=1}^{n-1} \frac{1}{w_j}|\zeta^2|^{n-j-1}\Big[\sum_{k=j+1}^{n} \frac{1}{w_k}|\zeta^2|^{n-k}\,(n-j+n-k)\,\{\cdot\}\Big] . \]

The expression $\{\cdot\}$ in the equations above has the following values:

for $\Delta_0$, $(n-j)(n-k)(k-j)^2$ ;
for $\Delta_1$, $(n-k+n-j)(k-j)^2$ ;
for $\Delta_2$, $(k-j)^2$ .

We may use these expressions for $\Delta$ and $\Delta'$ to compute $v$ and
$v'$. Using the expressions for $u$ (7.3) and $u'$ (7.4) we may solve
the equations for the nearest polynomial with a triple zero (7.1). The
partial derivatives (7.2) are used by Newton's method.
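The rearrangement can be illustrated with a short sketch that evaluates $\sigma_a\sigma_b - \sigma_c\sigma_d$ both directly and through the sum over k > j. It assumes a + b = c + d, which is what makes the diagonal terms vanish, and takes x to stand for $|\zeta^2|$. The function names are illustrative.

```python
def sigma(k, x, w):
    """sigma_k = sum_j (n-j)^k x^(n-j) / w_j with x = |zeta^2|."""
    n = len(w)
    return sum((n - j) ** k * x ** (n - j) / w[j - 1] for j in range(1, n + 1))

def delta_direct(a, b, c, d, x, w):
    """Straightforward evaluation, subject to cancellation."""
    return sigma(a, x, w) * sigma(b, x, w) - sigma(c, x, w) * sigma(d, x, w)

def brace(a, b, c, d, nj, nk):
    """The integer factor {.} for the pair (j, k), with nj = n-j and nk = n-k."""
    return nj ** a * nk ** b + nj ** b * nk ** a - nj ** c * nk ** d - nj ** d * nk ** c

def delta_stable(a, b, c, d, x, w):
    """Sum over pairs k > j; every term is a product of nonnegative factors
    whenever the brace is nonnegative, as it is for Delta_0, Delta_1, Delta_2."""
    n = len(w)
    total = 0.0
    for j in range(1, n):
        nj = n - j
        inner = 0.0
        for k in range(j + 1, n + 1):
            nk = n - k
            inner += x ** nk / w[k - 1] * brace(a, b, c, d, nj, nk)
        total += x ** nj / w[j - 1] * inner
    return total
```

The brace is computed in exact integer arithmetic, so the only rounding occurs in sums of terms of one sign; for $\Delta_2$ it reduces to $(k-j)^2$.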

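8. The Second Derivative of $\|q\|$

We have just seen which equation must be solved to find the stationary
points of $\|q\|$. Some of these points are local minima; others
are maxima or saddle points. To investigate the nature of the stationary
points we now develop expressions for directional second
derivatives of $\|q\|_w^2$.

Suppose that $\zeta = \zeta_0 + \theta\dot\zeta$ for $\theta$ real. Let the function to be
minimized be $v = q^*Wq$. As we have seen,

\[ \frac{dv}{d\theta} = \dot v = 2\,\mathrm{Re}(\dot q^*Wq) = 2\,\mathrm{Re}(\ell^*A\dot q) . \]

But the constraint $Ap + Aq = 0$ implies $A\dot q = -(ADp + ADq)\dot\zeta$; so
$\dot v = -2\,\mathrm{Re}(\ell^*(ADp + ADq)\dot\zeta)$. Therefore

\[ \ddot v = -2\,\mathrm{Re}\big(\dot\ell^*(ADp + ADq)\dot\zeta + \ell^*(AD^2p + AD^2q)\dot\zeta^2 + \ell^*AD\dot q\,\dot\zeta\big) . \]

Differentiating $Wq = A^*\ell$ we find

\[ W\dot q = \dot A^*\ell + A^*\dot\ell = D^*A^*\ell\,\dot\zeta^* + A^*\dot\ell . \]

Differentiating $(AW^{-1}A^*)\ell = -Ap$ reveals that

\[ \dot A W^{-1}A^*\ell + AW^{-1}\dot A^*\ell + AW^{-1}A^*\dot\ell = -\dot A p \]

or

\[ ADW^{-1}A^*\ell\,\dot\zeta + AW^{-1}D^*A^*\ell\,\dot\zeta^* + AW^{-1}A^*\dot\ell = -ADp\,\dot\zeta ; \]

so

\[ \ddot v = \mathrm{Re}(\phi\,\dot\zeta^2) + \psi\,|\dot\zeta|^2 \]

where

\[ \phi = 4q^*WDW^{-1}A^*(AW^{-1}A^*)^{-1}(ADp + ADq) - 2\ell^*(AD^2p + AD^2q) , \]

\[ \psi = -2q^*WDW^{-1}D^*Wq + 2(ADp + ADq)^*(AW^{-1}A^*)^{-1}(ADp + ADq)
+ 2q^*WDW^{-1}A^*(AW^{-1}A^*)^{-1}AW^{-1}D^*Wq . \tag{8.1} \]

Thus

\[ \ddot v = (\mathrm{Re}\,\dot\zeta \;\; \mathrm{Im}\,\dot\zeta)
\begin{pmatrix} \psi + \mathrm{Re}\,\phi & -\mathrm{Im}\,\phi \\ -\mathrm{Im}\,\phi & \psi - \mathrm{Re}\,\phi \end{pmatrix}
\begin{pmatrix} \mathrm{Re}\,\dot\zeta \\ \mathrm{Im}\,\dot\zeta \end{pmatrix} . \]

The eigenvalues of the matrix are $\psi \pm |\phi|$. If $\psi > |\phi|$ then $v$
is concave upward at $\zeta$. If $|\phi| < -\psi$ then $v$ is concave downward.
Other possibilities correspond to more complicated geometries. For
instance if $\psi = |\phi|$ at a stationary point, the point may be a minimum
or a saddle point, depending on the third derivative.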
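The eigenvalue claim can be verified in two lines from the trace and determinant of the matrix of the quadratic form; the symbol M for that matrix is introduced here only for the check.

```latex
\[
M = \begin{pmatrix} \psi + \mathrm{Re}\,\phi & -\mathrm{Im}\,\phi \\
                    -\mathrm{Im}\,\phi & \psi - \mathrm{Re}\,\phi \end{pmatrix},
\qquad \operatorname{tr} M = 2\psi,
\qquad \det M = \psi^2 - (\mathrm{Re}\,\phi)^2 - (\mathrm{Im}\,\phi)^2 = \psi^2 - |\phi|^2 ,
\]
\[
\lambda^2 - 2\psi\lambda + (\psi^2 - |\phi|^2) = 0
\quad\Longrightarrow\quad \lambda = \psi \pm |\phi| .
\]
```

Both eigenvalues are positive exactly when $\psi > |\phi|$, and both are negative exactly when $-\psi > |\phi|$, which matches the two definite cases named in the text.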

To  compute  the  components  comprising  v note  that 


, n+l-i  (n-k)!w 

(“  DKq),  - I 


and 


-1, 


n-1 


k+I 

I UJ 

,W| 

i2 


n-k-i+1 


(w.  ) 

q*WDW” ' D*Wq  = I (n-j)2  J+1-  ]q  J2 

J 


Special Cases for $\ddot v$

There are two cases in which the previous expression for $\ddot v$ may
be simplified. The simplifications will become evident after we prove
the

Lemma. $q^*WDW^{-1}D^*Wq = q^*WDW^{-1}A^*(AW^{-1}A^*)^{-1}AW^{-1}D^*Wq$ if and only if
m = n or $\ell_{m-1} = 0$.


Proof. (1) If m = n then $A$ and $A^*$ are square and invertible so

\[ (AW^{-1}A^*)^{-1} = (A^*)^{-1}WA^{-1} . \]

(2) If $\ell_{m-1} = 0$ then $A^*v = D^*A^*\ell$ has a unique solution
$v$: $v_0 = 0$, $v_{k+1} = \ell_k$. Also $\chi(v) = (A^*v - D^*A^*\ell)^*W^{-1}(A^*v - D^*A^*\ell)
= 0$. That means that the least squares problem

\[ W^{-1/2}A^*u = W^{-1/2}D^*A^*\ell = W^{-1/2}D^*Wq \]

has a solution $u$ and the residual $\chi(u)$ must vanish; otherwise
$v$ would be a better solution. In fact, since the rows of $A$ are
linearly independent, $u = v$. But another expression for $u$ is

\[ u = (AW^{-1}A^*)^{-1}AW^{-1}D^*Wq . \]

Then $\chi(u) = 0$ implies the desired result.

(3) Assume the hypothesis and that m < n; our goal is to show
that $\ell_{m-1} = 0$. If we write $B = W^{-1/2}A^*$ then the hypothesis is

\[ \ell^*ADW^{-1/2}(I - BB^+)W^{-1/2}D^*A^*\ell = 0 . \tag{8.2} \]

The theory of the pseudo-inverse implies that $I - BB^+$ is positive
semidefinite for any $B$. Therefore

\[ (I - BB^+)W^{-1/2}D^*A^*\ell = 0 \]

and $D^*A^*\ell = A^*v$ for $v = B^+W^{-1/2}D^*A^*\ell$. Since m < n the rows of
$AD$ are linearly independent so the equation $\ell^*AD = v^*A$ has a unique
solution $\ell$. By considering components we find that $v_0 = 0$ and
therefore that $\ell_k = v_{k+1}$, k = 0, ..., m-2, and finally that
$\ell_{m-1} = 0$ as claimed.

Q.E.D.
The next simplification lemma is an easy consequence of the
foregoing.

Lemma. If m = n or $\ell_{m-1} = 0$, then

\[ \ddot v = \mathrm{Re}(\phi\,\dot\zeta^2) + \psi\,|\dot\zeta|^2 \]

with

\[ \phi = 2\ell^*(AD^2p + AD^2q) , \]

\[ \psi = 2(ADp + ADq)^*(AW^{-1}A^*)^{-1}(ADp + ADq) . \]

Proof. The assertion about $\psi$ is a direct corollary of the
previous lemma. To prove the assertion about $\phi$ requires showing that
$\ell^*(AD^2p + AD^2q) = q^*WDW^{-1}A^*(AW^{-1}A^*)^{-1}(ADp + ADq)$.

: (1)  If  m = n then  we  must  show  that 

- /_2(p+q)(n)(0  = JL*ADA-1  (ADp  + ADq) 

0 
• 

0 
1 

j But  = x where  x represents  (T-c)n"V(n-l  )■ . Then 

0 

* 

ADx  = 0 

n! 

* 0 

a 

(2)  If  l , = 0 we  must  show  that 
* m- 1 


| { l*  ?(p+q)(m)(U  = q*WDW’1A*(AW’1A*)‘1y(p+q)(ni)(c) 

t . m-  c 

! \ 

\ or  = u*y  for  the  u*  of  the  previous  leima.  The  right  hand 
9. The Last Lagrange Multiplier is Zero at a Minimum

In a previous section we saw that there are two kinds of stationary
points for the norm of the distance to the nearest polynomial with an
m-tuple zero. Our object is to prove what we asserted then:

Proposition. Let $\zeta$ represent a stationary point for $\|q\|$ that
is locally minimal with respect to complex perturbations. Then the
last Lagrange multiplier $\ell_{m-1} = 0$.

Proof. We know that all stationary points for $\|q\|$ have either
$\ell_{m-1} = 0$ or $(p+q)^{(m)}(\zeta) = 0$. Therefore we must show that if
$(p+q)^{(m)}(\zeta) = 0$ and $\|q\|$ is locally minimal then $\ell_{m-1} = 0$. To do
this we will examine the expression for the second derivative obtained
in the previous sections.

The hypothesis, that $ADp + ADq = 0$, implies that

\[ \phi = -2\,\ell^*_{m-1}\,(p+q)^{(m+1)}(\zeta) \]

and

\[ \psi = -2q^*WD\{W^{-1} - W^{-1}A^*(AW^{-1}A^*)^{-1}AW^{-1}\}D^*Wq . \]

A minimum requires that $\psi \ge |\phi|$ or

\[ -q^*WDW^{-1}\{W - A^*(AW^{-1}A^*)^{-1}A\}W^{-1}D^*Wq \;\ge\; |\ell_{m-1}\,(p+q)^{(m+1)}(\zeta)| . \]

The quantity in $\{\cdot\}$ on the left is $W^{1/2}(I - BB^+)W^{1/2}$ where $B = W^{-1/2}A^*$.
$I - BB^+$ is positive semidefinite for any $B$, so the left hand side
must be $\le 0$. Since the right hand side is $\ge 0$, both sides are exactly
0, so

\[ q^*WDW^{-1}D^*Wq = q^*WDW^{-1}A^*(AW^{-1}A^*)^{-1}AW^{-1}D^*Wq \]

and

\[ \ell_{m-1}\,(p+q)^{(m+1)}(\zeta) = 0 . \]

The first lemma of the last section tells us consequently that
either $\ell_{m-1} = 0$, as claimed, or m = n. But if m = n, then

\[ (p+q)^{(n)}(\zeta) = n! \ne 0 , \]

contrary to the hypothesis that $ADp + ADq = 0$. This concludes the
proof as originally worked out by W. Kahan [19].

Thus to find the nearest polynomial with a double zero it is only
necessary to solve the simpler equations resulting from the assumption
that the last Lagrange multiplier vanishes. In the case of a real
polynomial, of course, it may happen that the nearest polynomial with a
double zero is a complex polynomial with a complex double zero.

The situation is much more complicated if, given a real polynomial,
we seek the nearest real polynomial with a double zero. Then three
possibilities may arise: the nearest such polynomial may have a real
double zero, a real triple zero, or a conjugate pair of complex double
zeros. The last case is treated in the next chapter. That the second
case may arise is illustrated by the following.

Example. Consider the real cubic polynomial whose roots are 1.0
and .224 ± .174i. Let the weights in the usual norm be 1, 1000, and
10000. Then the nearest real polynomial with a double zero is the
same as the nearest real polynomial with a triple zero, which is at
$\zeta$ = .4235... . The second Lagrange multiplier does not vanish
at this $\zeta$.
This example does not invalidate the proposition proved earlier
in this section. If complex perturbations are allowed, then when
double zeros are sought, $\zeta$ = .4235 is a saddle point rather than a
minimum. The nearest polynomials with double zeros turn out to have
$\zeta$ = .4245 ± .0993i, and this $\zeta$ may be found by allowing the second
Lagrange multiplier to vanish.

The example above was found by accident while searching for something
else; see Chapter VI. As a practical matter it seems likely that
such examples are quite rare, especially when normal weights are used.
In all the other examples we have encountered, it was sufficient to find
all the closest polynomials with double zeros and the closest with a
complex conjugate pair of double zeros.
10. Another Kind of Second Derivative

In the previous sections we have discussed a directional second
derivative for $v = q^*Wq$ which we compute by expressing $v$ as a function
of $\zeta$, the m-tuple zero. Another approach, which we could use
numerically as a qualitative check on the previous method, is to compute
a constrained Hessian matrix of partial second derivatives. In the
next two sections we will define this idea and explain how such a
matrix may be computed. Then the character of a stationary point may
be construed from the signs of the eigenvalues of the constrained
Hessian.

Let $f(x) = x^*Hx$ be a scalar function of the vector $x$. Then
how does $f$ vary when $x$ is constrained to the null space of a given
linear operator $L^*$? $L^*$ is m by n with m < n.

We could choose a transformation $P$ into a subspace of dimension
n-m so that every vector in the range of $P$ satisfies the constraint.
Then $P^*HP$ would be the constrained Hessian and its signature would
determine the nature of the stationary point.

As far as computational details go, we could let $P$ be composed
of columns from the QR factorization of $L$; see Figure III.1. $P$ of
course is not unique. We require $L$ to be of full rank m; that is,
none of the constraints are redundant. Then $R$ is invertible and
$L^*x = R^*Q^*x$, so $L^*x = 0 \Leftrightarrow Q^*x = 0$. Thus the columns of $P$
span the space of $x$ satisfying the constraint.

The QR factorization of a real rectangular matrix may be computed
using the algorithm decompose in the Wilkinson-Reinsch compendium [35,
pp. 113-114]. $Q$ will be computed as a product of m orthogonal
reflector matrices $(I - \beta uu^*)$. As each is computed, the corresponding
similarity may be performed stepwise on $H$. If $a$ represents a column
of $H$ and $b^*$ a row, then

\[ (I - \beta uu^*)a = a - \beta(u^*a)u , \]

\[ b^*(I - \beta uu^*) = b^* - \beta(b^*u)u^* . \]
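The stepwise similarity can be sketched for the simplest situation, a single real constraint (m = 1), where one reflector suffices. This is an illustration under that assumption rather than the decompose algorithm of [35]; the helper names are hypothetical. The reflector is chosen to map the constraint vector onto a multiple of the first unit vector, so the trailing (n-1) by (n-1) block of the transformed H is a constrained Hessian $P^*HP$.

```python
import math

def householder(x):
    """Return (u, beta) such that (I - beta*u*u^T) x is a multiple of e_1."""
    norm = math.sqrt(sum(t * t for t in x))
    u = list(x)
    u[0] += math.copysign(norm, x[0])
    beta = 2.0 / sum(t * t for t in u)
    return u, beta

def reflect(u, beta, a):
    """Apply the reflector to a vector: a - beta*(u.a)*u."""
    s = beta * sum(ui * ai for ui, ai in zip(u, a))
    return [ai - s * ui for ai, ui in zip(a, u)]

def similarity(u, beta, H):
    """Form (I - beta*u*u^T) H (I - beta*u*u^T) one column, then one row, at a time."""
    n = len(H)
    cols = [reflect(u, beta, [H[i][j] for i in range(n)]) for j in range(n)]
    left = [[cols[j][i] for j in range(n)] for i in range(n)]
    return [reflect(u, beta, row) for row in left]

def constrained_hessian(H, constraint):
    """Trailing block of the transformed H for the constraint constraint^T x = 0."""
    u, beta = householder(constraint)
    T = similarity(u, beta, H)
    return [row[1:] for row in T[1:]]
```

For several constraints one would apply m reflectors in succession, each acting on the trailing part of the previous result, and keep the trailing (n-m) by (n-m) block.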
11. Computational Details: A Constrained Hessian for v

We may apply the technique of the previous section to compute a
Hessian matrix for $v = q^*Wq$ subject to the constraint $Ap + Aq = 0$.

The constrained function to be minimized may be written

\[ \Gamma = q^*Wq - \ell^*(Ap + Aq) \]

with the Lagrange multipliers $\ell^*$ treated as independent of the variables
$q$ and $\zeta$. Unfortunately the complex variables $q$ appear in the
equation non-analytically while the complex variable $\zeta$ appears
analytically in $A$. Therefore we will divide $q$, $\ell^*$, and $\zeta$
into real and imaginary parts to have two sets of constraints;

\[ \mathrm{Re}(Ap + Aq) = 0 \]

and

\[ \mathrm{Im}(Ap + Aq) = 0 . \]

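Writing out the resulting expression for $\Gamma$ in scalar form,

\[ \Gamma = \sum_{j=1}^{n} w_j\{(\mathrm{Re}\,q_j)^2 + (\mathrm{Im}\,q_j)^2\} + \sum_{k=0}^{m-1} \mathrm{Re}\{\lambda_k\,(p+q)^{(k)}(\zeta)\} \]

where $\lambda_k = \rho_k - i\mu_k$. Then

\[ \frac{\partial\Gamma}{\partial\,\mathrm{Re}\,q_j} = 2w_j\,\mathrm{Re}\,q_j + \sum_{k=0}^{m-1} \mathrm{Re}(\lambda_k\,(n,j,k)\,\zeta^{n-j-k}) , \]

\[ \frac{\partial\Gamma}{\partial\,\mathrm{Im}\,q_j} = 2w_j\,\mathrm{Im}\,q_j - \sum_{k=0}^{m-1} \mathrm{Im}(\lambda_k\,(n,j,k)\,\zeta^{n-j-k}) , \]

\[ \frac{\partial\Gamma}{\partial\,\mathrm{Re}\,\zeta} = \sum_{k=0}^{m-1} \mathrm{Re}(\lambda_k\,(p+q)^{(k+1)}(\zeta)) , \]

\[ \frac{\partial\Gamma}{\partial\,\mathrm{Im}\,\zeta} = -\sum_{k=0}^{m-1} \mathrm{Im}(\lambda_k\,(p+q)^{(k+1)}(\zeta)) , \]

where $(n,j,k) = (n-j)!/(n-j-k)!$. The second derivatives are

\[ \frac{\partial^2\Gamma}{(\partial\,\mathrm{Re}\,q_j)^2} = 2w_j = \frac{\partial^2\Gamma}{(\partial\,\mathrm{Im}\,q_j)^2} , \]

\[ \frac{\partial^2\Gamma}{(\partial\,\mathrm{Re}\,\zeta)^2} = \sum_k \mathrm{Re}(\lambda_k\,(p+q)^{(k+2)}(\zeta)) = -\frac{\partial^2\Gamma}{(\partial\,\mathrm{Im}\,\zeta)^2} , \]

\[ \frac{\partial^2\Gamma}{\partial\,\mathrm{Re}\,\zeta\;\partial\,\mathrm{Im}\,\zeta} = -\sum_k \mathrm{Im}(\lambda_k\,(p+q)^{(k+2)}(\zeta)) , \]

\[ \frac{\partial^2\Gamma}{\partial\,\mathrm{Re}\,q_i\;\partial\,\mathrm{Re}\,q_j} = \frac{\partial^2\Gamma}{\partial\,\mathrm{Re}\,q_i\;\partial\,\mathrm{Im}\,q_j} = \frac{\partial^2\Gamma}{\partial\,\mathrm{Im}\,q_i\;\partial\,\mathrm{Im}\,q_j} = 0 \quad (i \ne j) , \]

\[ \frac{\partial^2\Gamma}{\partial\,\mathrm{Re}\,q_j\;\partial\,\mathrm{Re}\,\zeta} = \sum_k (n,j,k+1)\,\mathrm{Re}(\lambda_k\,\zeta^{n-j-k-1}) = -\frac{\partial^2\Gamma}{\partial\,\mathrm{Im}\,q_j\;\partial\,\mathrm{Im}\,\zeta} , \]

\[ \frac{\partial^2\Gamma}{\partial\,\mathrm{Re}\,q_j\;\partial\,\mathrm{Im}\,\zeta} = -\sum_k (n,j,k+1)\,\mathrm{Im}(\lambda_k\,\zeta^{n-j-k-1}) = \frac{\partial^2\Gamma}{\partial\,\mathrm{Im}\,q_j\;\partial\,\mathrm{Re}\,\zeta} . \]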

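With these expressions for partial second derivatives we may construct
the Hessian matrix $H$ of the previous section. Then the second
order change in $\Gamma$, for a small change

\[ \delta x = \begin{pmatrix} \mathrm{Re}\,\delta q \\ \mathrm{Im}\,\delta q \\ \mathrm{Re}\,\delta\zeta \\ \mathrm{Im}\,\delta\zeta \end{pmatrix} , \]

will be $\delta x^T H\,\delta x$.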

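The constraints on $\delta x$ should appear in the matrix $L$. Those
constraints may be found by differentiating $\mathrm{Re}(Ap + Aq)$ and
$\mathrm{Im}(Ap + Aq)$. Then

\[ \frac{\partial\,\mathrm{Re}(Ap+Aq)}{\partial\,\mathrm{Re}\,q_j} = \mathrm{Re}(Au_j) = \frac{\partial\,\mathrm{Im}(Ap+Aq)}{\partial\,\mathrm{Im}\,q_j} , \]

\[ \frac{\partial\,\mathrm{Im}(Ap+Aq)}{\partial\,\mathrm{Re}\,q_j} = \mathrm{Im}(Au_j) = -\frac{\partial\,\mathrm{Re}(Ap+Aq)}{\partial\,\mathrm{Im}\,q_j} , \]

where $u_j$ is the j'th column of the identity matrix. Also

\[ \frac{\partial\,\mathrm{Re}(Ap+Aq)}{\partial\,\mathrm{Re}\,\zeta} = \mathrm{Re}(ADp + ADq) = \frac{\partial\,\mathrm{Im}(Ap+Aq)}{\partial\,\mathrm{Im}\,\zeta} , \]

\[ \frac{\partial\,\mathrm{Im}(Ap+Aq)}{\partial\,\mathrm{Re}\,\zeta} = \mathrm{Im}(ADp + ADq) = -\frac{\partial\,\mathrm{Re}(Ap+Aq)}{\partial\,\mathrm{Im}\,\zeta} . \]

Then the matrix $L$ will be 2n + 2 by 2m and the matrix $H$ will be
2n + 2 by 2n + 2.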

It was necessary to resort to real arithmetic to deal with the
non-analytic nature of the function $\Gamma$. If, however, we happen to be
interested only in real changes in real $q$ and $\zeta$, then the dimensions
corresponding to imaginary parts may be omitted, with considerable
saving in computational effort to determine the signature of the
constrained $H$.


CHAPTER  IV 


FINDING  THE  NEAREST  REAL  POLYNOMIAL 
WITH  A COMPLEX  CONJUGATE  PAIR  OF  m-TUPLE  ZEROS 

1. Introduction

If we attempt to find the nearest polynomial with an m-tuple zero
using the methods of the previous chapter, we sometimes find that one
of the stationary points of $\|q\|$ corresponds to a complex m-tuple
zero even if the starting polynomial $p$ is real. Then $q$ turns
out to be complex. It might be more reasonable to restrict $q$ to be
real if $p$ is real. Then we would find that the nearest real polynomial
might have a real m-tuple zero, a real (m+1)-tuple zero, or a
conjugate pair of complex m-tuple zeros.

In the present chapter we will develop the equations to be solved
to find the nearest polynomial with a complex conjugate pair of m-tuple
zeros. In that development we will take care to divide symbolically
by $\mathrm{Im}\,\zeta$ to eliminate real solutions $\zeta$ that we usually do not want.
Then we will develop an expression for the second derivative and show
that we may assume that the last Lagrange multiplier vanishes, just as
in the previous chapter.
2. The Nearest Polynomial with a Complex Conjugate Pair
of m-tuple Zeros

Our goal is to minimize $v = q^*Wq$ subject to $Ap + Aq = 0$ and
$\bar A p + \bar A q = 0$. We assume that the polynomial $p$ is real, but the m-tuple
zeros $\zeta$ and $\bar\zeta$ are complex with $\mathrm{Im}\,\zeta \ne 0$. At first we will not
require $q$ or $W$ to be real.

The second constraint may be written $Ap + A\bar q = 0$ and the constraints
together imply $A\,\mathrm{Im}(q) = 0$, since $p$ is real.

As in the previous chapter let $\zeta$ vary in a specified direction
$\dot\zeta$ so $\zeta = \zeta_0 + \theta\dot\zeta$, $\theta$ real, and thus the directional derivative
$d\zeta/d\theta$ is $\dot\zeta$. Then $\dot v = 2\,\mathrm{Re}(\dot q^*Wq)$.

The result of differentiating the constraints is

\[ (ADp + ADq)\dot\zeta + A\dot q = 0 \]

and

\[ (ADp + AD\bar q)\dot\zeta + A\bar{\dot q} = 0 . \]


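Thus if the vector of infinitesimal changes is

\[ x = \begin{pmatrix} \mathrm{Re}\,\dot q \\ \mathrm{Im}\,\dot q \\ \mathrm{Re}\,\dot\zeta \\ \mathrm{Im}\,\dot\zeta \end{pmatrix} \]

then its constraint is $Cx = 0$, where

\[ C = \begin{pmatrix}
\mathrm{Re}\,A & -\mathrm{Im}\,A & \mathrm{Re}(ADp+ADq) & -\mathrm{Im}(ADp+ADq) \\
\mathrm{Im}\,A & \mathrm{Re}\,A & \mathrm{Im}(ADp+ADq) & \mathrm{Re}(ADp+ADq) \\
\mathrm{Re}\,A & \mathrm{Im}\,A & \mathrm{Re}(ADp+AD\bar q) & -\mathrm{Im}(ADp+AD\bar q) \\
\mathrm{Im}\,A & -\mathrm{Re}\,A & \mathrm{Im}(ADp+AD\bar q) & \mathrm{Re}(ADp+AD\bar q)
\end{pmatrix} . \]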
Then at a point where $v$ is stationary with respect to changes in
$q$ and $\zeta$ satisfying the constraint, $Cx = 0$ implies $y^*x = 0$ where

\[ y^* = (\; \mathrm{Re}(q^*W) \quad -\mathrm{Im}(q^*W) \quad 0 \quad 0 \;) . \]

The notation $x$, $y$, and $C$ has been chosen to conform to that of the
Lagrange multiplier theorem of Appendix 6. That theorem states that

\[ y^* = (\; r^* \;\; s^* \;\; u^* \;\; v^* \;)\,C \]

for a vector of Lagrange multipliers $(r^*\; s^*\; u^*\; v^*)$ of length 4m.
Therefore the components of $y^*$ are

\[ \mathrm{Re}(q^*W) = (r+u)^*\,\mathrm{Re}\,A + (s+v)^*\,\mathrm{Im}\,A , \tag{2.1} \]

\[ -\mathrm{Im}(q^*W) = (s-v)^*\,\mathrm{Re}\,A + (u-r)^*\,\mathrm{Im}\,A , \tag{2.2} \]

\[ \begin{aligned}
0 &= r^*\,\mathrm{Re}\,a_1 + s^*\,\mathrm{Im}\,a_1 + u^*\,\mathrm{Re}\,a_2 + v^*\,\mathrm{Im}\,a_2 , \\
0 &= -r^*\,\mathrm{Im}\,a_1 + s^*\,\mathrm{Re}\,a_1 - u^*\,\mathrm{Im}\,a_2 + v^*\,\mathrm{Re}\,a_2 ,
\end{aligned} \tag{2.3} \]

where $a_1 = ADp + ADq$ and $a_2 = ADp + AD\bar q$.

Recall the formula $q^*W = \ell^*A$ from the previous chapter. The
analogous formula now is

\[ q^*W = \ell_1^*\,\mathrm{Re}\,A + \ell_2^*\,\mathrm{Im}\,A , \tag{2.4} \]

where

\[ \ell_1^* = (r+u)^* + i\,(v-s)^* , \]

\[ \ell_2^* = (s+v)^* + i\,(r-u)^* . \]
where 
Then substituting into the constraints yields

\[ \begin{aligned}
Ap + AW^{-1}(\mathrm{Re}\,A)^T\ell_1 + AW^{-1}(\mathrm{Im}\,A)^T\ell_2 &= 0 , \\
\bar A p + \bar A W^{-1}(\mathrm{Re}\,A)^T\ell_1 + \bar A W^{-1}(\mathrm{Im}\,A)^T\ell_2 &= 0 .
\end{aligned} \tag{2.5} \]

This amounts to 4m real equations in 4m + 2 real unknowns, counting
$\mathrm{Re}\,\zeta$ and $\mathrm{Im}\,\zeta$. As in the previous chapter, there must be a way of
using (2.3) to eliminate some of the unknowns in (2.4).

Instead of pursuing this most general case, let us digress briefly
to see what simplifying assumptions might be helpful.

Recall that for a Hermitian $W$,

\[ q^*Wq = (\mathrm{Re}\,q)^T(\mathrm{Re}\,W)(\mathrm{Re}\,q) + (\mathrm{Im}\,q)^T(\mathrm{Re}\,W)(\mathrm{Im}\,q) - 2\,(\mathrm{Re}\,q)^T(\mathrm{Im}\,W)(\mathrm{Im}\,q) . \]

If $q$ is real, then $q^*Wq$ is independent of $\mathrm{Im}\,W$ so $W$ might as
well be taken to be real. From (2.2) and $A(\mathrm{Im}\,q) = 0$, moreover, we
deduce that

\[ 0 = -\mathrm{Im}(q^*W)(\mathrm{Im}\,q) = (\mathrm{Im}\,q)^T(\mathrm{Re}\,W)(\mathrm{Im}\,q) - (\mathrm{Re}\,q)^T(\mathrm{Im}\,W)(\mathrm{Im}\,q) . \]

Consequently if $W$ is real, then $\mathrm{Im}\,q = 0$.

Therefore the simplifying assumption we will make is that $W$ and
$q$ are real. Of course, real solutions $q$ are the ones most likely to
be of interest when $p$ is real.

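Returning to (2.2), with these assumptions we find

\[ 0 = (s-v)^*\,\mathrm{Re}\,A + (u-r)^*\,\mathrm{Im}\,A = (\; (s-v)^* \;\; (u-r)^* \;)\,B \]

where

\[ B = \begin{pmatrix} \mathrm{Re}\,A \\ \mathrm{Im}\,A \end{pmatrix} . \]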
We shall see in a subsequent section that the rows of $B$ are linearly
independent. Therefore $s = v$ and $u = r$, and (2.3) becomes

\[ \ell^*(ADp + ADq) = 0 \tag{2.6} \]

for $\ell^* = 2(r^* - is^*)$. (2.4) becomes

\[ q^*W = \mathrm{Re}(\ell^*A) . \tag{2.7} \]

(2.6) and (2.7) are the equations for stationarity of real $q$ and
complex $\zeta$ with respect to complex variations in $q$ and $\zeta$. (2.5)
becomes

\[ Ap + AW^{-1}\,\mathrm{Re}(A^*\ell) = 0 , \tag{2.8} \]

which is only 2m real equations in 2m + 2 real unknowns.

As in Chapter III we might hope to apply (2.6), which implies
that either the last Lagrange multiplier vanishes or else the
multiplicity of $\zeta$ is m+1. In a subsequent section we shall see that we
may reduce the dimension of (2.8) by one because the last Lagrange
multiplier always vanishes at stationary points which are local minima.

Consequently we may assume the last Lagrange multiplier vanishes
when solving (2.8), so the problem becomes one of solving 2m real
equations in 2m real unknowns. The equations are linear in the 2m - 2
remaining Lagrange multipliers and very non-linear in $\mathrm{Re}\,\zeta$ and $\mathrm{Im}\,\zeta$.
So as before we should eliminate the linear variables algebraically
and solve for $\zeta$ numerically. If $\zeta$ were held fixed temporarily and
symbolic Gaussian elimination were attempted on the remaining system
of 2m linear equations in 2m - 2 unknowns, one would obtain two
expressions involving $\mathrm{Re}\,\zeta$ and $\mathrm{Im}\,\zeta$ which would be required to
vanish. These last two expressions would be set to zero and solved
numerically for $\mathrm{Re}\,\zeta$ and $\mathrm{Im}\,\zeta$.

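We will leave the discussion of arbitrary m now and concentrate
on the most interesting case, when m = 2. In this case (2.8)
becomes much simpler. Then

\[ \ell = \begin{pmatrix} -\lambda \\ 0 \end{pmatrix} \]

and

\[ A = \begin{pmatrix} e^* \\ e^*D \end{pmatrix} \]

so

\[ \begin{pmatrix}
(\mathrm{Re}\,e^*)W^{-1}(\mathrm{Re}\,e) & -(\mathrm{Re}\,e^*)W^{-1}(\mathrm{Im}\,e) \\
(\mathrm{Im}\,e^*)W^{-1}(\mathrm{Re}\,e) & -(\mathrm{Im}\,e^*)W^{-1}(\mathrm{Im}\,e)
\end{pmatrix}
\begin{pmatrix} \mathrm{Re}\,\lambda \\ \mathrm{Im}\,\lambda \end{pmatrix}
= \begin{pmatrix} \mathrm{Re}\,p(\zeta) \\ \mathrm{Im}\,p(\zeta) \end{pmatrix} , \]

\[ \begin{pmatrix}
(\mathrm{Re}\,e^*D)W^{-1}(\mathrm{Re}\,e) & -(\mathrm{Re}\,e^*D)W^{-1}(\mathrm{Im}\,e) \\
(\mathrm{Im}\,e^*D)W^{-1}(\mathrm{Re}\,e) & -(\mathrm{Im}\,e^*D)W^{-1}(\mathrm{Im}\,e)
\end{pmatrix}
\begin{pmatrix} \mathrm{Re}\,\lambda \\ \mathrm{Im}\,\lambda \end{pmatrix}
= \begin{pmatrix} \mathrm{Re}\,p'(\zeta) \\ \mathrm{Im}\,p'(\zeta) \end{pmatrix} . \]

Written out in detail for the usual $W$:

\[ \begin{pmatrix}
\sum (\mathrm{Re}\,\zeta^{n-j})^2/w_j & \sum (\mathrm{Re}\,\zeta^{n-j})(\mathrm{Im}\,\zeta^{n-j})/w_j \\
\sum (\mathrm{Re}\,\zeta^{n-j})(\mathrm{Im}\,\zeta^{n-j})/w_j & \sum (\mathrm{Im}\,\zeta^{n-j})^2/w_j
\end{pmatrix}
\begin{pmatrix} \mathrm{Re}\,\lambda \\ \mathrm{Im}\,\lambda \end{pmatrix}
= \begin{pmatrix} \mathrm{Re}\,p(\zeta) \\ \mathrm{Im}\,p(\zeta) \end{pmatrix} , \]

\[ \begin{pmatrix}
\sum (n-j)(\mathrm{Re}\,\zeta^{n-j})^2/w_j & \sum (n-j)(\mathrm{Re}\,\zeta^{n-j})(\mathrm{Im}\,\zeta^{n-j})/w_j \\
\sum (n-j)(\mathrm{Re}\,\zeta^{n-j})(\mathrm{Im}\,\zeta^{n-j})/w_j & \sum (n-j)(\mathrm{Im}\,\zeta^{n-j})^2/w_j
\end{pmatrix}
\begin{pmatrix} \mathrm{Re}\,\lambda \\ \mathrm{Im}\,\lambda \end{pmatrix}
= \begin{pmatrix} \mathrm{Re}\,\zeta p'(\zeta) \\ \mathrm{Im}\,\zeta p'(\zeta) \end{pmatrix} . \]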
Write these last equations as

\[ A_0\lambda = x_0 \quad\text{and}\quad A_1\lambda = x_1 \]

for matrices $A_0$, $A_1$ and vectors $\lambda$, $x_0$, and $x_1$. We could solve
the equation

\[ F(\zeta) = A_0^{-1}x_0 - A_1^{-1}x_1 = 0 \]

or

\[ \hat F(\zeta) = D_1A_0^+x_0 - D_0A_1^+x_1 = 0 \tag{2.9} \]

where + denotes the adjoint and $D_i$ denotes the determinant $\det(A_i)$;
e.g.

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{+} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} . \]

In the equation $\hat F(\zeta) = 0$, we have avoided explicit inverses at
the cost of introducing extraneous solutions, by multiplying $F$ by
$D_0D_1$. The equation $\hat F(\zeta) = 0$ may be solved trivially by any real $\zeta$,
since then the $D_i$ vanish. Since only the complex solutions matter,
the real solutions will just be a nuisance that will distract numerical
procedures. Therefore we will discuss divided differences in the next
section to see whether we can avoid the numerical difficulties.
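A small sketch makes the nuisance concrete. It builds the two 2 by 2 systems written out above for the usual diagonal $W$ and evaluates the determinant-weighted residual $D_1A_0^+x_0 - D_0A_1^+x_1$. It assumes p is monic and real with coefficients listed from the leading term down; the function names are illustrative only.

```python
def horner(coeffs, z):
    """Evaluate a polynomial whose coefficients run from the leading term down."""
    acc = 0j
    for c in coeffs:
        acc = acc * z + c
    return acc

def derivative(coeffs):
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def f_hat(p, w, zeta):
    """Return (D0, D1, residual) for the m = 2 conjugate-pair equations."""
    n = len(w)
    zeta = complex(zeta)
    powers = [1 + 0j]
    for _ in range(n - 1):
        powers.append(powers[-1] * zeta)
    re = [powers[n - j].real for j in range(1, n + 1)]
    im = [powers[n - j].imag for j in range(1, n + 1)]

    def system(g, rhs):
        """Determinant of the weighted 2x2 matrix and its adjoint applied to rhs."""
        a11 = sum(r * r * gj for r, gj in zip(re, g))
        a12 = sum(r * i * gj for r, i, gj in zip(re, im, g))
        a22 = sum(i * i * gj for i, gj in zip(im, g))
        det = a11 * a22 - a12 * a12
        adj_x = (a22 * rhs.real - a12 * rhs.imag, -a12 * rhs.real + a11 * rhs.imag)
        return det, adj_x

    g0 = [1.0 / wj for wj in w]
    g1 = [(n - j) / w[j - 1] for j in range(1, n + 1)]
    d0, t0 = system(g0, horner(p, zeta))
    d1, t1 = system(g1, zeta * horner(derivative(p), zeta))
    return d0, d1, (d1 * t0[0] - d0 * t1[0], d1 * t0[1] - d0 * t1[1])
```

For a real argument every imaginary part is zero, so both determinants and both adjoint products vanish and the residual is zero whatever p is; a root finder applied to this form would therefore accept any point on the real axis.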
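3. Divided Differences for the Equation
for a Complex Conjugate Double Zero

The equation of the previous section

\[ \hat F(\zeta) = D_1A_0^+x_0 - D_0A_1^+x_1 = 0 \]

has every real $\zeta$ among its solutions. The reason for this state of
affairs is that $F$, the equation we really wished to solve, was multiplied
by $D_0D_1$. Now

\[ D_0 = \Big\{\sum (\mathrm{Re}\,\zeta^{n-j})^2/w_j\Big\}\Big\{\sum (\mathrm{Im}\,\zeta^{n-j})^2/w_j\Big\}
- \Big\{\sum (\mathrm{Re}\,\zeta^{n-j})(\mathrm{Im}\,\zeta^{n-j})/w_j\Big\}^2 . \]

But $\mathrm{Im}\,\zeta$ divides $\mathrm{Im}\,\zeta^k$ for any k > 0, as may be simply verified
by induction. Therefore we could write

\[ D_0 = (\mathrm{Im}\,\zeta)^2\Big[\Big(\sum (\mathrm{Re}\,\zeta^{n-j})^2/w_j\Big)\Big(\sum \Delta_{n-j}^2/w_j\Big)
- \Big(\sum (\mathrm{Re}\,\zeta^{n-j})\Delta_{n-j}/w_j\Big)^2\Big] \]

where the standard divided difference symbol $\Delta$ means

\[ \Delta_k = \frac{\mathrm{Im}\,\zeta^k}{\mathrm{Im}\,\zeta} = \text{a polynomial in } \mathrm{Im}\,\zeta \text{ and } \mathrm{Re}\,\zeta . \]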
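The induction mentioned above also yields a way to compute the $\Delta_k$ without dividing by $\mathrm{Im}\,\zeta$. The sketch below uses the recurrence obtained from $\zeta^{k+1} = \zeta^k\zeta$, namely $\mathrm{Re}\,\zeta^{k+1} = \mathrm{Re}\,\zeta^k\,\mathrm{Re}\,\zeta - (\mathrm{Im}\,\zeta)^2\Delta_k$ and $\Delta_{k+1} = \mathrm{Re}\,\zeta^k + \mathrm{Re}\,\zeta\,\Delta_k$. The recurrence is spelled out here as an illustration; the function name is hypothetical.

```python
def divided_differences(x, y, kmax):
    """Return (re, dd) with re[k] = Re zeta^k and dd[k] = Im zeta^k / Im zeta
    for zeta = x + i*y and k = 0..kmax, computed without any division."""
    re = [1.0]
    dd = [0.0]
    for _ in range(kmax):
        r, d = re[-1], dd[-1]
        re.append(r * x - y * y * d)   # Re zeta^(k+1)
        dd.append(r + x * d)           # Im zeta^(k+1) / Im zeta
    return re, dd
```

Because no division occurs, the values remain well defined as $\mathrm{Im}\,\zeta$ tends to zero, where $\Delta_k$ reduces to $k\,(\mathrm{Re}\,\zeta)^{k-1}$.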

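We could similarly factor out $(\mathrm{Im}\,\zeta)^2$ from $D_1$. It turns out,
moreover, that for real polynomials $p$, $\mathrm{Im}\,\zeta$ divides $\mathrm{Im}(p(\zeta))$ and
$\mathrm{Im}(\zeta p'(\zeta))$. We may denote these divided differences by $\Delta_p$ and
$\Delta_{\zeta p'}$. Then $A_0^+x_0$ is

\[ \begin{pmatrix} (\mathrm{Im}\,\zeta)^2 & 0 \\ 0 & \mathrm{Im}\,\zeta \end{pmatrix}
\begin{pmatrix}
\sum \Delta_{n-j}^2/w_j & -\sum (\mathrm{Re}\,\zeta^{n-j})\Delta_{n-j}/w_j \\
-\sum (\mathrm{Re}\,\zeta^{n-j})\Delta_{n-j}/w_j & \sum (\mathrm{Re}\,\zeta^{n-j})^2/w_j
\end{pmatrix}
\begin{pmatrix} \mathrm{Re}\,p(\zeta) \\ \Delta_p \end{pmatrix} . \]

In all, then, $(\mathrm{Im}\,\zeta)^4$ divides the upper element of the vector $D_1A_0^+x_0$
and $(\mathrm{Im}\,\zeta)^3$ divides the lower element. Have we found all possible
$\mathrm{Im}\,\zeta$ factors? If we have, the equation will no longer be solved by
every real $\zeta$.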


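To answer the question, let $\zeta$ approach a real value. Then as
$\mathrm{Im}\,\zeta \to 0$,

\[ \Delta_k \to \frac{d(\zeta^k)}{d\zeta} = k\zeta^{k-1} , \qquad \Delta_p \to p'(\zeta) , \qquad \Delta_{\zeta p'} \to \zeta p''(\zeta) + p'(\zeta) . \]

Then when we substitute this information in the equation

\[ \hat F(\zeta) = \begin{pmatrix} (\mathrm{Im}\,\zeta)^4 & 0 \\ 0 & (\mathrm{Im}\,\zeta)^3 \end{pmatrix}
\begin{pmatrix} F_1(\zeta) \\ F_2(\zeta) \end{pmatrix} \tag{3.1} \]

we find that, for instance,

\[ \frac{\zeta^4F_1(\zeta)}{\sigma_2} \to (\sigma_1\sigma_3 - \sigma_2^2)\,p(\zeta) - (\sigma_0\sigma_3 - \sigma_1\sigma_2)\,\zeta p'(\zeta) + (\sigma_0\sigma_2 - \sigma_1^2)\,(\zeta^2p''(\zeta) + \zeta p'(\zeta)) . \]

The right hand side is just the equation (III.7.1) to be solved to
find the nearest polynomial with a real triple zero $\zeta$.

Naively we might expect that the limiting case of equation (3.1),
an equation for two complex conjugate double zeros, would look like
the equation for one real quadruple zero, rather than a triple zero.
That such is not the case shows how unreliable intuition can be when
applied to these problems!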

We may safely conclude, however, that all factors of $\mathrm{Im}\,\zeta$ have
been removed from (3.1). Ideally, the equation for a real triple zero
should also be removed by algebraic means. That removal is such a
formidable prospect that it seems more attractive just to numerically
prevent convergence to those real $\zeta$'s. Therefore we will solve
$F(\zeta) = 0$ with $F = (F_1, F_2)$ defined as in (3.1), with the $\mathrm{Im}\,\zeta$ factors
removed symbolically but with convergence to the real triple zeros
prevented numerically. The reader interested in the details of computing $F$
may find them in the next few sections.

In the previous chapter we saw that the nearest real polynomial
with a triple zero may sometimes also be the nearest real polynomial
with a double zero. By numerically deflating the solutions for triple
zeros we might be missing some interesting information, but experience
has shown that, if the solutions for double zeros are unsatisfactory,
then the triple zeros are much more efficiently found by solving the
equations for triple zeros rather than allowing the solutions of the
equations for complex conjugate pairs to coalesce.




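4. Computational Details: The Equations to Solve
for a Complex Conjugate Pair of Double Zeros

We aim to find zeros of the function

\[ F(\zeta) = \begin{pmatrix} F_1(\zeta) \\ F_2(\zeta) \end{pmatrix} . \]

Therefore define

\[ \hat D_i = D_i/(\mathrm{Im}\,\zeta)^2 \]

and

\[ \begin{pmatrix} t_i \\ b_i \end{pmatrix} = \begin{pmatrix} 1/(\mathrm{Im}\,\zeta)^2 & 0 \\ 0 & 1/\mathrm{Im}\,\zeta \end{pmatrix} A_i^+x_i \]

for i = 0, 1, so that

\[ F(\zeta) = \hat D_1\begin{pmatrix} t_0 \\ b_0 \end{pmatrix} - \hat D_0\begin{pmatrix} t_1 \\ b_1 \end{pmatrix} . \]

Now

\[ \hat D_0 = \Big(\sum (\mathrm{Re}\,\zeta^{n-j})^2/w_j\Big)\Big(\sum \Delta_{n-j}^2/w_j\Big) - \Big(\sum (\mathrm{Re}\,\zeta^{n-j})\Delta_{n-j}/w_j\Big)^2 \]

and $\hat D_1$ is the same, except $(n-j)/w_j$ replaces $1/w_j$. The formula
may be rewritten

\[ \hat D_0 = \sum_{j=1}^{n-1} \frac{1}{w_j} \sum_{k=j+1}^{n} \frac{1}{w_k}\,|\zeta^4|^{n-k}\,\Delta_{k-j}^2 . \tag{4.1} \]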

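The formulas for the derivatives are

\[ \frac{\partial\hat D_0}{\partial\,\mathrm{Re}\,\zeta} = 2|\zeta^2| \sum_{j=1}^{n-2} \frac{1}{w_j} \sum_{k=j+1}^{n-1} \frac{1}{w_k}\,|\zeta^4|^{n-k-1}\,\Delta_{k-j}\,\big[2(n-k)\,\mathrm{Re}\,\zeta\;\Delta_{k-j} + |\zeta^2|\,\Delta^r_{k-j}\big]
+ \frac{2}{w_n} \sum_{j=1}^{n-1} \frac{1}{w_j}\,\Delta_{n-j}\Delta^r_{n-j} , \tag{4.2a} \]

\[ \frac{\partial\hat D_0}{\partial\,\mathrm{Im}\,\zeta} = 2|\zeta^2| \sum_{j=1}^{n-2} \frac{1}{w_j} \sum_{k=j+1}^{n-1} \frac{1}{w_k}\,|\zeta^4|^{n-k-1}\,\Delta_{k-j}\,\big[2(n-k)\,\mathrm{Im}\,\zeta\;\Delta_{k-j} + |\zeta^2|\,\Delta^i_{k-j}\big]
+ \frac{2}{w_n} \sum_{j=1}^{n-1} \frac{1}{w_j}\,\Delta_{n-j}\Delta^i_{n-j} . \tag{4.2b} \]

In the notation of Appendix 4,

\[ \Delta^r_k = \frac{\partial\Delta_k}{\partial\,\mathrm{Re}\,\zeta} , \qquad \Delta^i_k = \frac{\partial\Delta_k}{\partial\,\mathrm{Im}\,\zeta} . \]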

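Now

\[ t_0 = \Big(\sum \Delta_{n-j}^2/w_j\Big)\mathrm{Re}\,p - \Big(\sum (\mathrm{Re}\,\zeta^{n-j})\Delta_{n-j}/w_j\Big)\Delta_p \]

and

\[ \begin{aligned}
\frac{\partial t_0}{\partial\,\mathrm{Re}\,\zeta} ={}& \Big(\sum \Delta_{n-j}^2/w_j\Big)\mathrm{Re}\,p' + (\mathrm{Re}\,p)\sum (2\Delta_{n-j}\Delta^r_{n-j})/w_j \\
& - \Big(\sum (\mathrm{Re}\,\zeta^{n-j})\Delta_{n-j}/w_j\Big)\frac{\partial\Delta_p}{\partial\,\mathrm{Re}\,\zeta} \\
& - \Delta_p \sum \big(\mathrm{Re}\,\zeta^{n-j}\,\Delta^r_{n-j} + (n-j)\,\Delta_{n-j}\,\mathrm{Re}\,\zeta^{n-j-1}\big)/w_j ; \\
\frac{\partial t_0}{\partial\,\mathrm{Im}\,\zeta} ={}& -\Big(\sum \Delta_{n-j}^2/w_j\Big)\mathrm{Im}\,p' + (\mathrm{Re}\,p)\sum (2\Delta_{n-j}\Delta^i_{n-j})/w_j \\
& - \Big(\sum (\mathrm{Re}\,\zeta^{n-j})\Delta_{n-j}/w_j\Big)\frac{\partial\Delta_p}{\partial\,\mathrm{Im}\,\zeta} \\
& - \Delta_p \sum \big(\mathrm{Re}\,\zeta^{n-j}\,\Delta^i_{n-j} - (n-j)\,\Delta_{n-j}\,\mathrm{Im}\,\zeta^{n-j-1}\big)/w_j .
\end{aligned} \tag{4.3} \]

Likewise

\[ t_1 = \Big(\sum (n-j)\Delta_{n-j}^2/w_j\Big)\mathrm{Re}\,\zeta p' - \Big(\sum (n-j)(\mathrm{Re}\,\zeta^{n-j})\Delta_{n-j}/w_j\Big)\Delta_{\zeta p'} , \]

\[ \begin{aligned}
\frac{\partial t_1}{\partial\,\mathrm{Re}\,\zeta} ={}& \Big(\sum (n-j)\Delta_{n-j}^2/w_j\Big)\mathrm{Re}(\zeta p'' + p') + (\mathrm{Re}\,\zeta p')\sum (n-j)(2\Delta_{n-j}\Delta^r_{n-j})/w_j \\
& - \Big(\sum (n-j)(\mathrm{Re}\,\zeta^{n-j})\Delta_{n-j}/w_j\Big)\frac{\partial\Delta_{\zeta p'}}{\partial\,\mathrm{Re}\,\zeta} \\
& - \Delta_{\zeta p'} \sum (n-j)\big(\mathrm{Re}\,\zeta^{n-j}\,\Delta^r_{n-j} + (n-j)\,\Delta_{n-j}\,\mathrm{Re}\,\zeta^{n-j-1}\big)/w_j .
\end{aligned} \tag{4.4} \]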
rtj 
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3t, 

The  expression  for  may  be  obtained  similarly  by  substi- 

tuting (n-j)/w.  for  1/w  and  cp'  for  p in  the  expression  for 
3 Im  c" 

Continuing  in  the  same  fashion, 


b0  = ' ^Rec"  ^n-j/wi^Re  p + (I(Re  Cn"j)2/w.)A  , 

" u J J p 

bl  * - (I(n-j)Re  ?n‘Vj/w.)Re  cp'  + (I(n-j)(Re  cn-j)2/„.)l  , , 
3b-  . ^ £P 

7W7  ' " P' 

(4.5)  ‘ (Re  P>I(Re  + 

* <I<Re  cn-J)z/wj)Ts^-+  lpl2(„-j)Re  C"-JReC"-J'-V 

3b0  . J 

FTrfTc  = ^Rec"  JVj/wj)Im  pl 

- (Re  p) J(Re  cn'jA|V  - (n-jU^  Imcn‘j_1)/w. 

n i o 3A 

* <I(Re  5 -J)  /w.)^-  4p[2(„.j)Re  e"-J  lmcn-j-l/w  . 

J 

The formulas for the derivatives of $b_1$ can be obtained by the usual
substitutions.

The formulas in this section may be used to implement Newton's
method to solve the two real equations $F_0(\zeta) = 0$ and $F_1(\zeta) = 0$ for
their two real unknowns $\mathrm{Re}\,\zeta$ and $\mathrm{Im}\,\zeta$.
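The iteration described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `newton_2x2` is hypothetical, the Jacobian is approximated by central differences rather than by the analytic derivative formulas of this section, and `example_F` is a simple non-analytic stand-in (it involves $|\zeta|^2$ and $\bar\zeta$), not the function $F$ of this section.

```python
import numpy as np

def newton_2x2(F, zeta0, tol=1e-12, max_iter=50, h=1e-7):
    """Newton's method on the real pair (Re F, Im F) in the unknowns (Re zeta, Im zeta)."""
    x = np.array([zeta0.real, zeta0.imag], dtype=float)

    def residual(x):
        f = F(complex(x[0], x[1]))
        return np.array([f.real, f.imag])

    for _ in range(max_iter):
        r = residual(x)
        if np.linalg.norm(r) < tol:
            break
        # Central-difference Jacobian (assumption: analytic derivatives
        # would normally be supplied here instead).
        J = np.empty((2, 2))
        for k in range(2):
            step = np.zeros(2)
            step[k] = h
            J[:, k] = (residual(x + step) - residual(x - step)) / (2.0 * h)
        x = x - np.linalg.solve(J, r)
    return complex(x[0], x[1])

def example_F(zeta, omega=0.25):
    # Hypothetical stand-in: zeta(|zeta|^2 + 2 omega) + conj(zeta).
    return zeta * (abs(zeta) ** 2 + 2.0 * omega) + zeta.conjugate()

root = newton_2x2(example_F, 0.1 + 0.8j)
```

Because the unknowns are treated as two real variables, the same driver works whether or not the function is analytic in $\zeta$.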


5.  The Rows of B are Linearly Independent


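Corresponding to the complex operator $A$ of previous chapters,
it was necessary in Section 2 to define the real operator $B$, which
maps $\mathbf{R}^n$ to $\mathbf{R}^{2m}$, by

$$ B = \begin{pmatrix} \mathrm{Re}\; e^*D^j \\ \mathrm{Im}\; e^*D^j \end{pmatrix} , \qquad 0 \le j \le m-1 . $$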


Proposition.  If $\mathrm{Im}\,\zeta \ne 0$ then the rows of $B$ are linearly
independent.

Corollary.  $BW^{-1}B^T$ is invertible.


Proof of Proposition.  We will show that $B$ has full rank $2m$ by
exhibiting a set of real vectors $\{q_k^r,\ 0 \le k \le m-1\}$ and a set
$\{q_k^i\}$ whose corresponding polynomials satisfy, at the zero $\alpha$,

$$ (q_k^r)^{(j)}(\alpha) = \begin{cases} 0 , & 0 \le j \le k-1 , \\ 1 , & j = k , \end{cases}
\qquad
(q_k^i)^{(j)}(\alpha) = \begin{cases} 0 , & 0 \le j \le k-1 , \\ i , & j = k . \end{cases} $$

The existence of $2m$ such real $q$'s is equivalent to the linear
independence of the rows of $B$.

Clearly, for either set of $q$'s,

$$ q_k(\tau) = (\tau-\alpha)^k(\tau-\bar\alpha)^k s(\tau) $$

for some real $s(\tau)$ with $s(\alpha) \ne 0$.  Obviously $q_k^{(j)}(\alpha) = 0$ for
$0 \le j \le k-1$.  Furthermore $q_k^{(k)}(\alpha) \ne 0$.  If $\alpha$ were real it would
suffice to let $s(\tau)$ be a real constant.  But what if $\alpha$ is complex?
It turns out that $s(\tau) = \theta\tau + \eta$ with real $\theta$ and $\eta$ is
sufficient.  To see this we must examine $q_k^{(k)}(\alpha)$.  First form an expression
for $(\tau-\bar\alpha)^k$:

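$$ (\tau-\bar\alpha)^k = (\tau-\alpha+2i\,\mathrm{Im}\,\alpha)^k
   = \sum_{j=0}^{k}\binom{k}{j}(\tau-\alpha)^j(2i\,\mathrm{Im}\,\alpha)^{k-j} $$

by the binomial theorem.  Then

$$ (\tau-\alpha)^k(\tau-\bar\alpha)^k
   = \sum_{j=0}^{k}\binom{k}{j}(\tau-\alpha)^{k+j}(2i\,\mathrm{Im}\,\alpha)^{k-j} , $$

$$ \frac{d^r}{d\tau^r}\{(\tau-\alpha)^k(\tau-\bar\alpha)^k\}
   = \sum_{j=0}^{k}\binom{k}{j}\frac{(k+j)!}{(k+j-r)!}(\tau-\alpha)^{k+j-r}(2i\,\mathrm{Im}\,\alpha)^{k-j} , \qquad r \le k , $$

and

$$ \frac{d^r}{d\tau^r}\{(\tau-\alpha)^k(\tau-\bar\alpha)^k\}\Big|_{\tau=\alpha}
   = \begin{cases} 0 , & r < k , \\ k!\,(2\,\mathrm{Im}\,\alpha)^k\, i^k , & r = k . \end{cases} $$

We may now invoke Leibniz' rule,

$$ D^k(ps) = \sum_{j=0}^{k}\binom{k}{j}(D^{k-j}p)(D^{j}s) , $$

to find

$$ q_k^{(k)}(\alpha) = \frac{d^k}{d\tau^k}\{(\tau-\alpha)^k(\tau-\bar\alpha)^k s(\tau)\}\Big|_{\tau=\alpha}
   = k!\,(2\,\mathrm{Im}\,\alpha)^k\, i^k\, s(\alpha) . $$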


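This expression for $q_k^{(k)}(\alpha)$ shows that it is only necessary to choose
an appropriate real $s$ of degree at most 1 to get any desired complex
value of $q_k^{(k)}(\alpha)$.  If $\omega$ is the desired complex value of $s(\alpha)$ then

$$ \mathrm{Re}\, s(\alpha) = \mathrm{Re}(\theta\alpha + \eta) = \theta\,\mathrm{Re}\,\alpha + \eta = \mathrm{Re}\,\omega ; $$
$$ \mathrm{Im}\, s(\alpha) = \theta\,\mathrm{Im}\,\alpha = \mathrm{Im}\,\omega . $$

Thus $\theta = \mathrm{Im}\,\omega/\mathrm{Im}\,\alpha$, $\eta = \mathrm{Re}\,\omega - \theta\,\mathrm{Re}\,\alpha$, so we can construct $s$ and
therefore each $q_k^r$ and $q_k^i$.  So the rows of $B$ are linearly
independent as claimed.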
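The construction in this proof can be checked numerically. The sketch below is an illustration only: the helper name `build_q` and the sample values of $\alpha$, $k$ and $\omega$ are assumptions. It builds the real polynomial $(\tau-\alpha)^k(\tau-\bar\alpha)^k(\theta\tau+\eta)$ with $\theta$ and $\eta$ chosen as above and evaluates its derivatives at $\alpha$.

```python
import numpy as np
from math import factorial

def build_q(alpha, k, omega):
    """Real coefficients of (t - alpha)^k (t - conj(alpha))^k (theta t + eta), with s(alpha) = omega."""
    theta = omega.imag / alpha.imag              # requires Im alpha != 0
    eta = omega.real - theta * alpha.real
    quad = np.array([1.0, -2.0 * alpha.real, abs(alpha) ** 2])  # (t - alpha)(t - conj(alpha))
    q = np.array([1.0])
    for _ in range(k):
        q = np.polymul(q, quad)
    return np.polymul(q, np.array([theta, eta]))

alpha, k, omega = 0.3 + 1.2j, 3, 2.0 - 0.5j     # hypothetical sample data
q = build_q(alpha, k, omega)

# Derivatives of order 0..k evaluated at alpha.
derivs = [np.polyval(q, alpha)] + [np.polyval(np.polyder(q, j), alpha) for j in range(1, k + 1)]
expected = factorial(k) * (2.0 * alpha.imag) ** k * 1j ** k * omega
```

The derivatives of order below $k$ should vanish to rounding error, and the $k$'th derivative should agree with $k!\,(2\,\mathrm{Im}\,\alpha)^k i^k \omega$.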
6.  The Last Lagrange Multiplier is Zero

Section 2 demonstrates that there are two kinds of stationary
points for $v = q^*Wq$, $q$ real, namely those for which the last
Lagrange multiplier vanishes, and those for which the multiplicity is
greater than anticipated, so that $(p+q)^{(m)}(\zeta) = 0$.

Proposition.  Let $\zeta$ represent a stationary point for $\|q\|$ that
is locally minimal with respect to complex perturbations of $\zeta$.  Then
the last Lagrange multiplier $\ell_{m-1} = 0$.




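Proof.  Since $v = q^*Wq$, $\dot v = 2\,\mathrm{Re}(q^*W\dot q)$.  But $Wq = \mathrm{Re}(A^*\ell)$ for
a complex vector $\ell$ of Lagrange multipliers.  Therefore

$$ \dot v = 2\,\mathrm{Re}(\ell^*A\dot q) = -2\,\mathrm{Re}(\ell^*(ADp + ADq)\dot\zeta) $$

because of the constraint $Ap + Aq = 0$.  Then

(6.1)  $$ \ddot v = -2\,\mathrm{Re}\{\dot\ell^*(ADp+ADq)\dot\zeta + \ell^*(AD^2p+AD^2q)\dot\zeta^2 + \ell^*AD\dot q\,\dot\zeta\} . $$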

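Assume now that we are at one of the stationary points with $ADp + ADq = 0$.
Our next task is to obtain expressions for $\dot q$ and $\dot\ell$.  Differentiating
the constraint reveals that $A\dot q = 0$, that is,

$$ AW^{-1}\,\mathrm{Re}(A^*\dot\ell) = -AW^{-1}\,\mathrm{Re}(D^*A^*\ell\,\bar{\dot\zeta}) , $$

from which we deduce that

$$ BW^{-1}B^T\begin{pmatrix}\mathrm{Re}\,\dot\ell \\ -\mathrm{Im}\,\dot\ell\end{pmatrix} = -B\,\mathrm{Re}(W^{-1}D^*A^*\ell\,\bar{\dot\zeta}) , $$

as in previous sections.  Since the rows of $B$ are linearly independent,
$BW^{-1}B^T$ is positive definite and

(6.2)  $$ \begin{pmatrix}\mathrm{Re}\,\dot\ell \\ -\mathrm{Im}\,\dot\ell\end{pmatrix} = -(BW^{-1}B^T)^{-1}B\,\mathrm{Re}(W^{-1}D^*A^*\ell\,\bar{\dot\zeta}) , \qquad
       \dot q = W^{-1}\,\mathrm{Re}(A^*\dot\ell + D^*A^*\ell\,\bar{\dot\zeta}) . $$

Consequently

$$ \mathrm{Re}(A^*\dot\ell) = -B^T(BW^{-1}B^T)^{-1}B\,\mathrm{Re}(W^{-1}D^*A^*\ell\,\bar{\dot\zeta}) $$

and

(6.3)  $$ \dot q = W^{-1}(W - B^T(BW^{-1}B^T)^{-1}B)W^{-1}D^*\,\mathrm{Re}(A^*\ell\,\bar{\dot\zeta}) . $$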

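Recall that

$$ \ddot v = -2\,\mathrm{Re}(\ell^*(AD^2p + AD^2q)\dot\zeta^2) - 2\,\mathrm{Re}(\ell^*AD\dot q\,\dot\zeta) . $$

We may write

(6.4)  $$ \mathrm{Re}(\ell^*AD\dot q\,\dot\zeta) = \mathrm{Re}(\dot\zeta\,\ell^*A)\,DW^{-1}(W - B^T(BW^{-1}B^T)^{-1}B)W^{-1}D^*\,\mathrm{Re}(A^*\ell\,\bar{\dot\zeta}) . $$

The matrix $(W - B^T(BW^{-1}B^T)^{-1}B)$ is positive semidefinite so both sides
are real and $\ge 0$.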
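As in the previous chapter we may write

$$ \ddot v = -2\,(\mathrm{Re}\,\dot\zeta \;\; \mathrm{Im}\,\dot\zeta)
\begin{pmatrix}
\mathrm{Re}\,\phi + (\mathrm{Re}\,b)^TM(\mathrm{Re}\,b) & -\mathrm{Im}\,\phi + (\mathrm{Im}\,b)^TM(\mathrm{Re}\,b) \\
-\mathrm{Im}\,\phi + (\mathrm{Im}\,b)^TM(\mathrm{Re}\,b) & -\mathrm{Re}\,\phi + (\mathrm{Im}\,b)^TM(\mathrm{Im}\,b)
\end{pmatrix}
\begin{pmatrix}\mathrm{Re}\,\dot\zeta \\ \mathrm{Im}\,\dot\zeta\end{pmatrix} $$

where

$$ b = D^*A^*\ell , $$
$$ M = W^{-1/2}\bigl(I - (W^{-1/2}B^T)[(W^{-1/2}B^T)^T(W^{-1/2}B^T)]^{-1}(W^{-1/2}B^T)^T\bigr)W^{-1/2} , $$

and $\phi = \ell^*(AD^2p + AD^2q)$.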

Then a tedious but straightforward argument paralleling that of
Section III.9 shows that $\ddot v \ge 0$ for all $\dot\zeta$ implies $\phi = 0$.
Alternatively we may recognize that for a suitable $\dot\zeta$,

$$ \ddot v = -2\{|\ell^*(AD^2p+AD^2q)\dot\zeta^2| + \mathrm{Re}(\ell^*AD\dot q\,\dot\zeta)\} . $$

At a local minimum $\ddot v \ge 0$ for all $\dot\zeta$; recall (6.4) to see that
$\ell^*(AD^2p+AD^2q) = 0$ and also $\ell^*AD\dot q = 0$.

Thus by either argument, at a stationary point which is also a
minimum, $\ell_{m-1} = 0$ or $(p+q)^{(m+1)}(\zeta) = 0$.  In the first case we are
finished.  The second case implies that $n \ge 2(m+2)$.

Furthermore, $\ell^*AD\dot q = 0$ and (6.3) tell us that

$$ \mathrm{Re}(\ell^*A)DW^{-1}\{W-B^T(BW^{-1}B^T)^{-1}B\}W^{-1}D^*\,\mathrm{Re}(A^*\ell) = 0 . $$

Since the matrix in brackets is positive semidefinite,

$$ \{W-B^T(BW^{-1}B^T)^{-1}B\}W^{-1}D^*\,\mathrm{Re}(A^*\ell) = 0 $$

and

$$ \mathrm{Re}(\ell^*A)D = s^TB $$

where

$$ s^T = \mathrm{Re}(\ell^*A)DW^{-1}B^T(BW^{-1}B^T)^{-1} $$

so $s^T$ is real.

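Our next goal is to construct a matrix like $B$, but augmented by
two more rows, from which we can conclude the result.  Partition $\ell^*$
and $s^T$ as follows:

$$ \ell^* = (\tilde\ell, \lambda)^* , \qquad s^T = (\mu \;\; u^T \;\; \theta \;\; v^T) . $$

$\lambda$, $\mu$, and $\theta$ are scalars.  Then

$$ \mathrm{Re}(\ell^*A) = (\mathrm{Re}\,\ell)^T\,\mathrm{Re}\,A - (\mathrm{Im}\,\ell)^T\,\mathrm{Im}\,A $$

and

$$ \mathrm{Re}(\ell^*A)D = (\mathrm{Re}\,\tilde\ell^T \;\; \mathrm{Re}\,\lambda \;\; -\mathrm{Im}\,\tilde\ell^T \;\; -\mathrm{Im}\,\lambda)
\begin{pmatrix}\mathrm{Re}\,AD \\ \mathrm{Im}\,AD\end{pmatrix} . $$

Finally let

$$ \tilde A = \begin{pmatrix} e^*D \\ \vdots \\ e^*D^{m-1}\end{pmatrix} $$

so

$$ A = \begin{pmatrix} e^* \\ \tilde A\end{pmatrix} \qquad\text{and}\qquad AD = \begin{pmatrix}\tilde A \\ e^*D^m\end{pmatrix} . $$

Then

$$ \mathrm{Re}(\ell^*A)D - s^TB = 0 $$

may be written

$$ (-\mu,\; \mathrm{Re}\,\tilde\ell^T - u^T,\; \mathrm{Re}\,\lambda,\; -\theta,\; -\mathrm{Im}\,\tilde\ell^T - v^T,\; -\mathrm{Im}\,\lambda)
\begin{pmatrix} \mathrm{Re}\,e^* \\ \mathrm{Re}\,\tilde A \\ \mathrm{Re}\,e^*D^m \\ \mathrm{Im}\,e^* \\ \mathrm{Im}\,\tilde A \\ \mathrm{Im}\,e^*D^m \end{pmatrix} = 0 . $$

The matrix on the right is just a $B$ matrix, but for $m$ augmented
by 1.  Since $n \ge 2(m+2)$, the augmented matrix has at most $n$ rows
which are linearly independent.  Consequently $\tilde\ell = u - iv$, $\theta = 0$,
$\mu = 0$, and $\lambda = 0$.  But this $\lambda$ is just the last Lagrange multiplier
$\ell_{m-1}$, concluding the proof.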

We  learned  in  the  previous  chapter  that  to  find  the  nearest  real 
polynomial  with  a real  double  zero,  it  might  be  necessary  to  solve 
equations  for  a real  double  zero  and  equations  for  a real  triple  zero. 
But  in  this  chapter  we  have  the  more  satisfactory  result  that  to  find 
the  nearest  real  polynomial  with  a complex  conjugate  pair  of  double 
zeros,  we  need  solve  only  one  set  of  equations;  it  is  not  necessary  to 
look  for  the  nearest  real  polynomial  with  a complex  conjugate  pair  of 
triple  zeros. 


CHAPTER  V 


FINDING  THE  NEAREST  POLYNOMIAL  WITH  MORE  THAN  ONE  MULTIPLE  ZERO 
1.  Introduction

Previous  chapters  have  exhibited  the  equations  to  be  solved  to 
find the nearest polynomial with one multiple zero or one pair of complex
conjugate multiple zeros.  Now we turn to the more general problem
of  finding  the  nearest  polynomial  with  a specified  configuration  of 
multiple  zeros.  We  shall  see  that  despite  some  complications  the 
theory  bears  a family  resemblance  to  what  has  gone  before.  We  shall 
find  that,  in  the  complex  case,  the  equations  to  be  solved  for  the 
multiple  zeros  assume  forms  simpler  than  what  might  have  been  expected, 
because  certain  Lagrange  multipliers  vanish.  However  there  is  some 
doubt,  in  general,  as  to  which  of  these  simpler  equations  should  be 
solved  for  the  multiple  zeros.  Fortunately  when  all  the  zeros  are 
double  the  equations  to  solve  are  fairly  obvious. 

Unfortunately,  just  as  in  the  case  of  the  complex  conjugate 
multiple  zeros,  the  equations  we  solve  become  much  more  complicated 
when divided differences are taken in order to inhibit unwanted coalescence
of the multiple zeros.  These equations are given in full detail
for  the  case  of  several  double  zeros,  and  especially  for  the  case  of 
two  double  zeros. 
2.  Multiple Zeros

Given a complex polynomial $p(\tau)$ we seek the nearest polynomial
$(p+q)(\tau)$ such that $p + q$ has $k$ complex multiple zeros $\zeta_i$.  Each
$\zeta_i$ has a multiplicity $m_i \ge 2$, and $\sum m_i \le n$.  Corresponding to the
operator $A$ of previous chapters we define $A_i$ by

$$ A_i = \begin{pmatrix} e_i^* \\ e_i^*D \\ \vdots \\ e_i^*D^{m_i-1}\end{pmatrix} ; $$

$e_i$ is the evaluation functional for $\zeta_i$.  The $m_i$ by $n+1$ operator $\hat A_i$
is defined analogously with $\hat e_i^*$ replacing $e_i^*$.  Then the equation

$$ \hat A_ip + A_iq = 0 $$

expresses the constraint that $p + q$ has an $m_i$-tuple zero $\zeta_i$.

We also define the operator

$$ S = \begin{pmatrix} A_1 \\ \vdots \\ A_k\end{pmatrix} , $$

which may be seen to be somewhat like the $B$ of the previous chapter;
it will be used for similar purposes.

Proposition.  If $\zeta_i \ne \zeta_j$ when $i \ne j$ then the rows of $S$ are
linearly independent.
Corollary.  $SW^{-1}S^*$ is invertible.

Proof of Proposition.  We will show that $S$ has full rank by
displaying $\sum m_i$ linearly independent vectors

$$ Sq_{j,r} , \qquad 1 \le j \le k , \quad 0 \le r \le m_j - 1 . $$

The $q_{j,r}$ are defined by their corresponding polynomials as

$$ q_{j,r}(\tau) = (\tau-\zeta_j)^r \prod_{i \ne j}(\tau-\zeta_i)^{m_i} , $$

and the conclusion follows immediately.
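A numerical illustration of the proposition, offered only as a sketch: the helper names `confluent_rows` and `build_S` are hypothetical, and the coefficient ordering $q(\tau) = q_1\tau^{n-1} + \cdots + q_n$ is an assumption made for this example. Each block $A_i$ is assembled from the rows $e_i^*D^r$ and the rank of the stacked matrix is computed.

```python
import numpy as np

def confluent_rows(zeta, m, ncols):
    """Rows e* D^r, r = 0..m-1: the r'th derivative at zeta of t^(ncols-1), ..., t, 1."""
    rows = np.zeros((m, ncols), dtype=complex)
    for r in range(m):
        for j in range(ncols):
            power = ncols - 1 - j
            if power >= r:
                coef = 1.0
                for i in range(r):
                    coef *= power - i            # falling factorial power!/(power-r)!
                rows[r, j] = coef * zeta ** (power - r)
    return rows

def build_S(zetas, mults, n):
    """Stack the blocks A_i for the given zeros and multiplicities."""
    return np.vstack([confluent_rows(z, m, n) for z, m in zip(zetas, mults)])

# Distinct zeros with multiplicities summing to 7 <= n = 8: full row rank expected.
S_distinct = build_S([1.0, -0.5 + 1.0j, 2.0], [2, 3, 2], n=8)
rank_distinct = np.linalg.matrix_rank(S_distinct)

# Coincident zeros violate the hypothesis and the rank drops.
S_repeated = build_S([1.0, 1.0], [2, 2], n=6)
rank_repeated = np.linalg.matrix_rank(S_repeated)
```

With distinct zeros the rank equals the number of rows; repeating a zero duplicates rows, which is the degenerate case the hypothesis excludes.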

Our goal is to minimize $v = q^*Wq$ subject to $\hat A_ip + A_iq = 0$,
$1 \le i \le k$.  Let the raised dot $(\dot{\ })$ represent differentiation in a
particular direction of a specific $\zeta_j$: $\zeta_j(\theta) = \zeta_j(0) + \theta\dot\zeta_j$.  Then
as usual

$$ \dot v = 2\,\mathrm{Re}(q^*W\dot q) . $$

Differentiate the j'th constraint to find

$$ (\hat A_jDp + A_jDq)\dot\zeta_j + A_j\dot q = 0 , $$

but differentiate the other $k-1$ constraints to find

$$ A_i\dot q = 0 , \qquad i \ne j , $$

because $A_i$ is independent of $\zeta_j$.

By applying the Lagrange multiplier theorem of Appendix 6 at a
stationary point, discover in the usual way that

(2.1)  $$ q^*W = \sum_i \ell_i^*A_i = \ell^*S . $$



There are $k$ vectors $\ell_i^*$ of Lagrange multipliers and $\ell^*$ is their
concatenation.  Furthermore

(2.2)  $$ \ell_j^*(\hat A_jDp + A_jDq) = 0 $$

for $1 \le j \le k$.

Thus at a stationary point, for each $j$, either its last Lagrange
multiplier, the last element of $\ell_j^*$, vanishes or $\zeta_j$ has multiplicity
one greater than expected.  In the next section we will see how the
techniques of previous chapters can be applied to show that the minima
of $v$ always have that last multiplier equal to zero.

Now when we substitute in the constraints we find

$$ \hat A_ip + A_iW^{-1}S^*\ell = 0 , \qquad 1 \le i \le k , $$

or

(2.3)  $$ SW^{-1}S^*\ell = -\hat Sp . $$

Since the rows of $S$ are linearly independent, $SW^{-1}S^*$ is positive
definite symmetric and therefore invertible.  But we may assume that
$k$ elements of $\ell$ vanish, so we have $\sum m_i$ linear equations in
$(\sum m_i) - k$ unknowns.  The attempt to solve such a system by Gaussian
elimination yields $k$ expressions which must vanish.  The corresponding
$k$ non-linear equations in the $\zeta_i$ may in principle be solved
for the $\zeta_i$.  In subsequent sections we will display equations for the
case that all the zeros are double.
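For fixed trial zeros, equation (2.3) is an ordinary linear system, and solving it gives the smallest perturbation $q = W^{-1}S^*\ell$ that places multiple zeros exactly there. The sketch below illustrates this under assumptions that are not taken from the text: the names `confluent_rows` and `nearest_with_zeros` are hypothetical, $W$ is taken diagonal, $p$ is stored as monic coefficients in descending order, and no multiplier is assumed to vanish because the zeros are prescribed rather than optimized.

```python
import numpy as np

def confluent_rows(zeta, m, ncols):
    """Rows e* D^r, r = 0..m-1, acting on coefficients of t^(ncols-1), ..., t, 1."""
    rows = np.zeros((m, ncols), dtype=complex)
    for r in range(m):
        for j in range(ncols):
            power = ncols - 1 - j
            if power >= r:
                coef = 1.0
                for i in range(r):
                    coef *= power - i
                rows[r, j] = coef * zeta ** (power - r)
    return rows

def nearest_with_zeros(p, zetas, mults, w):
    """Minimize q* W q subject to p + q having the prescribed multiple zeros.

    p: monic coefficients [1, p_1, ..., p_n]; w: diagonal weights w_1..w_n.
    Returns q = (q_1, ..., q_n), the perturbation of the non-leading coefficients.
    """
    n = len(p) - 1
    S = np.vstack([confluent_rows(z, m, n) for z, m in zip(zetas, mults)])
    S_hat = np.vstack([confluent_rows(z, m, n + 1) for z, m in zip(zetas, mults)])
    W_inv = np.diag(1.0 / np.asarray(w, dtype=float))
    ell = np.linalg.solve(S @ W_inv @ S.conj().T, -S_hat @ p)   # equation (2.3)
    return W_inv @ S.conj().T @ ell                             # q = W^{-1} S* ell

p = np.array([1.0, 0.0, -1.0])                  # t^2 - 1
q_origin = nearest_with_zeros(p, [0.0], [2], [1.0, 1.0])

w = np.array([1.0, 4.0])
zeta = 1j * np.sqrt(0.5)
q_imag = nearest_with_zeros(p, [zeta], [2], w)
perturbed = p + np.concatenate(([0.0], q_imag))
norm_sq = float(np.sum(w * abs(q_imag) ** 2))
```

Minimizing the resulting norm over the trial zeros is the non-linear part of the problem; this sketch only covers the inner linear solve.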
3.  The Last Lagrange Multipliers are Zero

From the previous section we may deduce that

$$ \dot v = 2\,\mathrm{Re}(q^*W\dot q) = 2\,\mathrm{Re}(\ell^*S\dot q) = -2\,\mathrm{Re}(\ell_j^*(\hat A_jDp+A_jDq)\dot\zeta_j) . $$

When $v$ is stationary, then for each $j$, either its last Lagrange
multiplier vanishes or the multiplicity of $\zeta_j$ is one greater than
expected.

Proposition.  Assume $i \ne j \Rightarrow \zeta_i \ne \zeta_j$.  Then at a stationary point
at which $v$ is minimal with respect to complex perturbations in $\zeta_j$,
the last Lagrange multiplier in $\ell_j$ vanishes.

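Proof.  Continue to differentiate the expression for $\dot v$ above:

$$ \ddot v = -2\,\mathrm{Re}\{\dot\ell_j^*(\hat A_jDp+A_jDq)\dot\zeta_j + \ell_j^*(\hat A_jD^2p+A_jD^2q)\dot\zeta_j^2 + \ell_j^*A_jD\dot q\,\dot\zeta_j\} . $$

Assume that $\hat A_jDp + A_jDq = 0$ at a stationary point, which simplifies
the expression for $\ddot v$ above.  Furthermore the assumption means that
$\sum m_i < n$ because $k \ge 2$ and all $\zeta_i$'s are distinct.

From (2.1),

$$ \dot q = W^{-1}\dot S^*\ell + W^{-1}S^*\dot\ell , $$

and from the constraint and the assumption,

$$ S\dot q = 0 . $$

Therefore

$$ SW^{-1}S^*\dot\ell = -SW^{-1}\dot S^*\ell . $$

But

$$ \dot S^*\ell = \dot A_j^*\ell_j = D^*A_j^*\ell_j\,\bar{\dot\zeta}_j , $$

so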
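$$ \dot\ell = -(SW^{-1}S^*)^{-1}SW^{-1}D^*A_j^*\ell_j\,\bar{\dot\zeta}_j , $$

$$ \dot q = W^{-1}D^*A_j^*\ell_j\,\bar{\dot\zeta}_j - W^{-1}S^*(SW^{-1}S^*)^{-1}SW^{-1}D^*A_j^*\ell_j\,\bar{\dot\zeta}_j , $$

and

$$ \ell_j^*A_jD\dot q\,\dot\zeta_j = \ell_j^*A_jDW^{-1/2}\{I - (W^{-1/2}S^*)(W^{-1/2}S^*)^+\}W^{-1/2}D^*A_j^*\ell_j\,|\dot\zeta_j|^2 . $$

$(I - MM^+)$ is positive semidefinite for any $M$ so

$$ \ddot v = -2\,\mathrm{Re}\{\ell_j^*(\hat A_jD^2p + A_jD^2q)\dot\zeta_j^2\}
   - 2\bigl(\ell_j^*A_jDW^{-1/2}\{I - (W^{-1/2}S^*)(W^{-1/2}S^*)^+\}W^{-1/2}D^*A_j^*\ell_j\bigr)|\dot\zeta_j|^2 . $$

If $v$ is to have a local minimum then $\ddot v \ge 0$ for any $\dot\zeta_j$, yet by apt
choice of $\dot\zeta_j$ we may arrange for both terms to be real and negative,
so they both must vanish:

$$ \lambda_j^*\,(p+q)^{(m_j+1)}(\zeta_j) = 0 , $$

where $\lambda_j^*$ denotes the last element of $\ell_j^*$, and

(3.1)  $$ \ell_j^*A_jDW^{-1/2}\{I - (W^{-1/2}S^*)(W^{-1/2}S^*)^+\} = 0 . $$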

From this point we follow the argument of III.8 to show that $\lambda_j^*$,
the last element of $\ell_j^*$, vanishes.  From (3.1) we find

$$ \ell_j^*A_jD = v^*S $$

where $v^* = \ell_j^*A_jDW^{-1}S^*(SW^{-1}S^*)^{-1}$.  Now partition $v$ conformally
with $S$ so

$$ v^*S = \sum_i v_i^*A_i . $$

Introduce an augmented operator $\tilde S$, formed from $S$ by appending the
row $e_j^*D^{m_j}$ to the block $A_j$, so that $\tilde S$ has $\sum m_i + 1$ rows.
Then we may rewrite the equation

$$ v^*S - \ell_j^*A_jD = 0 $$

as

(3.2)  $$ (v_1^* \;\cdots\; v_{j-1}^* \;\; \tilde v_j^* \;\; v_{j+1}^* \;\cdots\; v_k^*)\,\tilde S = 0 , $$

where $\tilde v_j^* = (v_j^* \;\; 0) - (0 \;\; \ell_j^*)$.  Since $\sum m_i < n$, the rows of $\tilde S$ are
linearly independent, so the vector in (3.2) vanishes.  In particular,
the last element of $\tilde v_j^*$, which is $-\lambda_j^*$, the last Lagrange multiplier,
vanishes as claimed, completing the proof.


As in Chapter III, the present result applies when complex perturbations
are considered.  In the case of real perturbations of a real
polynomial,  the  result  is  known  to  be  false  in  general  for  k = 1 and 
counterexamples  could  probably  be  constructed  for  larger  k.  It  seems 
likely,  however,  that  in  most  practical  problems  satisfactory  results 
may  be  obtained  by  assuming  that  the  last  Lagrange  multipliers  vanish. 
4.  Equations for k Real Double Zeros

The  nearest  polynomial  with  k real  double  zeros  is  of  interest 
in  studying  polynomials  like  Wilkinson's  (Chapter  X).  The  formulas 
we  shall  derive  have  not  been  treated  by  means  of  divided  differences. 
Section  6 contains  formulas  for  the  case  k = 2 derived  with  the  aid 
of  divided  differences. 

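The equation we wish to solve is (2.3):

$$ SW^{-1}S^*\ell = -\hat Sp . $$

We know that the last elements vanish for each $\ell_i$, a subvector of $\ell$.
Therefore we may define the vector $\Lambda$ by letting $\Lambda_i$ be the first
element of $\ell_i$.  Then

$$ S^*\ell = \sum_i A_i^*\ell_i = \sum_i e_i\Lambda_i . $$

Recall that $e_i$ is the evaluation functional for $\zeta_i$.

Having eliminated some of the unknowns we are left with $2k$ equations
in the $2k$ variables $\Lambda_1, \Lambda_2, \ldots, \Lambda_k$ and $\zeta_1, \ldots, \zeta_k$.  Since
the equations are linear in the $\Lambda_i$'s we can easily eliminate them,
leaving $k$ non-linear equations in the $\zeta$'s.  To do this divide the
equation (2.3) into two pieces:

$$ S_0W^{-1}S_0^*\Lambda = -\hat S_0p $$

and

(4.1)  $$ S_1W^{-1}S_0^*\Lambda = -\hat S_1p , $$

where the rows of $S_0$ are the $e_i^*$ and the rows of $S_1$ are the $e_i^*D$,
$1 \le i \le k$.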
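To simplify matters later multiply (4.1) by the matrix
$Z = \mathrm{diag}(\zeta_1, \ldots, \zeta_k)$.  Then if we define

$$ T_0 = S_0W^{-1}S_0^* , \qquad T_1 = ZS_1W^{-1}S_0^* , $$

$$ v_0 = \hat S_0p = \begin{pmatrix} p(\zeta_1) \\ \vdots \\ p(\zeta_k)\end{pmatrix} , \qquad
   v_1 = Z\hat S_1p = \begin{pmatrix} \zeta_1p'(\zeta_1) \\ \vdots \\ \zeta_kp'(\zeta_k)\end{pmatrix} , $$

where

$$ (T_0)_{ij} = \sum_{r=1}^{n}(\zeta_i\bar\zeta_j)^{n-r}/w_r \qquad\text{and}\qquad (T_1)_{ij} = \sum_{r=1}^{n}(n-r)(\zeta_i\bar\zeta_j)^{n-r}/w_r , $$

then we may eliminate $\Lambda$ and try to find zeros of the function

(4.2)  $$ F(z) = \Lambda_0 - \Lambda_1 = T_0^{-1}v_0 - T_1^{-1}v_1 , $$

where $z = (\zeta_1, \ldots, \zeta_k)$ and $F$ are $k$-vectors.

To keep the following computational details simple, we restrict
attention to real $\zeta_i$.  We wish to solve (4.2) by Newton's method; to
get the necessary derivatives let $(\dot{\ })$ represent $\partial/\partial\zeta_j$ and recall that
$(M^{-1})^{\cdot} = -M^{-1}\dot MM^{-1}$ for invertible matrices $M$.  Thus

(4.3)  $$ \dot F(z) = (T_0^{-1})^{\cdot}v_0 + T_0^{-1}\dot v_0 - (T_1^{-1})^{\cdot}v_1 - T_1^{-1}\dot v_1
        = T_0^{-1}(\dot v_0 - \dot T_0\Lambda_0) - T_1^{-1}(\dot v_1 - \dot T_1\Lambda_1) . $$

Now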
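$$ \dot T_0 = \dot S_0W^{-1}S_0^* + S_0W^{-1}\dot S_0^*
   = \begin{pmatrix} & 0 & \\ \cdots & e_j^*DW^{-1}e_i & \cdots \\ & 0 & \end{pmatrix}
   + \begin{pmatrix} & \vdots & \\ 0 & e_i^*W^{-1}D^*e_j & 0 \\ & \vdots & \end{pmatrix} , $$

in which the first matrix is zero outside row $j$ and the second is zero
outside column $j$.  Therefore

$$ (\dot T_0\Lambda_0)_i = \Lambda_{0,j}\,\zeta_i\sum_r (n-r)(\zeta_i\zeta_j)^{n-r-1}/w_r , \qquad i \ne j , $$

$$ (\dot T_0\Lambda_0)_j = \sum_i \Lambda_{0,i}\,\zeta_i\sum_r (n-r)(\zeta_j\zeta_i)^{n-r-1}/w_r
   + \Lambda_{0,j}\,\zeta_j\sum_r (n-r)(\zeta_j\zeta_j)^{n-r-1}/w_r , $$

$$ (\dot T_1\Lambda_1)_i = \Lambda_{1,j}\,\zeta_i\sum_r (n-r)^2(\zeta_i\zeta_j)^{n-r-1}/w_r , \qquad i \ne j , $$

$$ (\dot T_1\Lambda_1)_j = \sum_i \Lambda_{1,i}\,\zeta_i\sum_r (n-r)^2(\zeta_j\zeta_i)^{n-r-1}/w_r
   + \Lambda_{1,j}\,\zeta_j\sum_r (n-r)^2(\zeta_j\zeta_j)^{n-r-1}/w_r . $$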
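The function (4.2) is straightforward to evaluate for real trial zeros. The following sketch is illustrative only: the names `T_matrices`, `F` and `newton` are hypothetical, the weights are assumed to form a diagonal $W$, $p$ is stored as monic coefficients in descending order, and the Jacobian is approximated by central differences rather than by the analytic formula (4.3).

```python
import numpy as np

def T_matrices(z, w):
    """T0 and T1 for real zeros z and diagonal weights w_1..w_n."""
    n = len(w)
    expo = n - np.arange(1, n + 1)                   # exponents n-r, r = 1..n
    powers = np.outer(z, z)[:, :, None] ** expo      # (zeta_i zeta_j)^(n-r)
    T0 = (powers / w).sum(axis=2)
    T1 = (powers * expo / w).sum(axis=2)
    return T0, T1

def F(z, p, w):
    """F(z) = T0^{-1} v0 - T1^{-1} v1 of equation (4.2), real case."""
    z = np.asarray(z, dtype=float)
    T0, T1 = T_matrices(z, np.asarray(w, dtype=float))
    v0 = np.polyval(p, z)
    v1 = z * np.polyval(np.polyder(p), z)
    return np.linalg.solve(T0, v0) - np.linalg.solve(T1, v1)

def newton(z0, p, w, tol=1e-12, max_iter=50, h=1e-6):
    """Newton iteration on F with a finite-difference Jacobian (an assumption)."""
    z = np.asarray(z0, dtype=float).copy()
    for _ in range(max_iter):
        f = F(z, p, w)
        if np.linalg.norm(f) < tol:
            break
        J = np.empty((len(z), len(z)))
        for j in range(len(z)):
            dz = np.zeros_like(z)
            dz[j] = h
            J[:, j] = (F(z + dz, p, w) - F(z - dz, p, w)) / (2.0 * h)
        z = z - np.linalg.solve(J, f)
    return z

# A polynomial that already has two double zeros: F vanishes there.
p5 = np.poly([1.0, 1.0, 3.0, 3.0, 5.0])
w5 = np.ones(5)
F_at_doubles = F([1.0, 3.0], p5, w5)
```

For $k = 1$ the function reduces to $p(\zeta)/\sigma_0 - \zeta p'(\zeta)/\sigma_1$ for sums $\sigma_0 = \sum_r \zeta^{2(n-r)}/w_r$ and $\sigma_1 = \sum_r (n-r)\zeta^{2(n-r)}/w_r$, which gives a simple consistency check on the implementation.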
5.  Deflation for Several Double Zeros

When  solving  equation  (4.2)  for  polynomials  of  degrees  only 
modestly  larger  than  2k,  one  often  finds  that  zeros  of  F are  quite 
abundant.  In  order  to  prevent  reconvergence  to  zeros  already  found, 
some  sort  of  deflation  is  required. 

Unless  further  steps  are  taken,  moreover,  convergence  will  occur 
to solutions in which some of the ostensibly distinct $\zeta_i$ have coalesced.
This  behavior  must  also  be  suppressed;  we  shall  do  so  numerically. 

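A workable approach is to find the zeros of $G$, rather than $F$,
where

$$ G(z) = F(z)/\Delta , \qquad \Delta = \Delta_1\Delta_2 , $$

$$ \Delta_1 = \prod_{i>r}(\zeta_i-\zeta_r)^2 $$

for elements $\zeta_i$ and $\zeta_r$ of $z$, and

$$ \Delta_2 = \prod_s \|z-z_s\|^2 $$

for known zeros $z_s$ of $F$.  If we let $(\dot{\ }) = \partial/\partial\zeta_j$, then

$$ \dot G = \dot F/\Delta - (\dot\Delta/\Delta)G . $$

We know that

$$ (\dot\Delta/\Delta) = (\dot\Delta_1/\Delta_1) + (\dot\Delta_2/\Delta_2) $$

and we find that

$$ \dot\Delta_1/\Delta_1 = 2\sum_{i \ne j}\frac{1}{\zeta_j-\zeta_i} , \qquad
   \dot\Delta_2/\Delta_2 = 2\sum_s\frac{\zeta_j-(z_s)_j}{\|z-z_s\|^2} . $$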
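The deflation factor and its logarithmic derivative can be sketched as below. The function names `delta`, `log_derivative` and `deflate` are hypothetical, and the sample points are arbitrary; the sketch assumes real $\zeta$'s, as in the rest of this section.

```python
import numpy as np

def delta(z, known):
    """Delta = Delta_1 * Delta_2 for trial vector z and previously found zeros `known`."""
    z = np.asarray(z, dtype=float)
    d1 = 1.0
    for i in range(len(z)):
        for r in range(i):
            d1 *= (z[i] - z[r]) ** 2                 # penalizes coalescing components
    d2 = 1.0
    for zs in known:
        d2 *= np.sum((z - np.asarray(zs, dtype=float)) ** 2)   # penalizes known zeros
    return d1 * d2

def log_derivative(z, known, j):
    """(d Delta / d zeta_j) / Delta, from the two sums given in the text."""
    z = np.asarray(z, dtype=float)
    t1 = 2.0 * sum(1.0 / (z[j] - z[i]) for i in range(len(z)) if i != j)
    t2 = 2.0 * sum((z[j] - zs[j]) / np.sum((z - np.asarray(zs, dtype=float)) ** 2)
                   for zs in known)
    return t1 + t2

def deflate(F_value, z, known):
    """G(z) = F(z) / Delta."""
    return np.asarray(F_value, dtype=float) / delta(z, known)

z = np.array([0.5, 1.7, 3.0])
known = [[1.0, 2.0, 4.0], [0.0, 1.0, 2.0]]
j, h = 1, 1e-6
step = np.zeros(3)
step[j] = h
fd = (delta(z + step, known) - delta(z - step, known)) / (2.0 * h * delta(z, known))
```

Comparing `fd` with `log_derivative(z, known, j)` is a quick way to validate the analytic sums before they are used inside a Newton iteration on $G$.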
(6.1) 


0 l?r52l2  wj  k=I+l  wk  ( 1 51  s2 1 2) "‘k ( Afc- j 1 2 • 


In  the  last  expression,  the  divided  difference  t.  , , 

is  a polynomial  in  r and  r * 2 1 * 

" ana  C,  for  any  i > n Tha  „ 

2 y — u*  'he  corresponding 


result  for  $1  is 


To  apply  Newton's  method  the  derivatives  will  be 
and  ;2  are  real: 


(6.2)  . i.  Vj  ,»V 

0^-,  W, 


required;  assume  r 


1 n j=l  w. 


w,  3? 


1 


„ ? n-2  . n-1  , 

+ I — I — A ^r2  2»n-k-l  r/  . 3A.  . 

2 J=1  Wj  H+l  wk  k-J(?l?2}  Un-kjA^.  + -J^L}] 

Since  t,  and  ?2  are  s^etrio  in  (6.„,  jg  ray  be  obt3jned  by 
interchanging  the  roles  of  c,  and  ?2  in  (6.1).  S1.11irly  !fl 

may  be  °btained  by  (n-JI/Wj  for  1/w  and  (n-k)/’ 


for  l/wk. 


When  finding  zeros  we  will  need  to  compute  *• 

nf  * + P V the  first  element 

of  the  vector  pv  anH  a , . 

0 0-  ailtl  9,.  the  fi irst  element  of  the  vecto-  dv 
Then  IT 


80  5 V<t,-t2)  * j (u*)n-jA 


?>  ® ./w. 

j=l  2 P.n-J  J 


(6.3) 
Now 


P.n-j 


t"'JP<gi)-c?~Jp(t,) 


S1  " ?2 


VJ  is  a polynomial  c.  and  ^ the  details  of  its  construe 
tion  are  given  in  Appendix  5.  Similarly 


(6'4)  e~l  5 V<?,-?2>  * c,lc2l2  , 

j=1  2 P »n-j-l‘ 


where 


* 


A , 

P ,n-j 


The  derivatives  of  the  §'s  will  also  be 


needed,  in  the  real  case 


?0  _ 1 3Ap  n "I1  n i 
M " wT  + h I $ 3 


(6.5) 


d^l  wn  ^ • 

ae0  . 1 \o  - n-i-1  9An  n • 

^ VSg^Mn-. 


*2  wn  L 2 U2  atp^n-jUp^.J/w 

39 

)/w 

1 1 di»i  P ,n-j-lJ/wj  » 

39 

C1C2I(n-j)cJ-J-1{(n.J+i,A  . + n-j-K. 

2 c P »n-J-l  C2  3^~}/wj  • 


We  could  find  zeros  of  the  function 


b^t  for  simplicity 
P(z)  = Vi  . 4,  , t 

Icr'2l  («,-c2)  2 ' Tvcp"Tovo  - rc7%rT,S  ■ 

F(2)  ‘ ° ’*  * System  of  tw°  The  „m  one  „ 

(6.6) 


A 

6, 


Vo"Vi  = 0 • 


The second equation may be obtained from (6.6) by reversing all
occurrences of $\zeta_1$ and $\zeta_2$ in the expressions for the quantities
that appear in it.  The appropriate derivatives may be computed similarly.

Now that a specific equation, (6.6), is ready to be solved, methods
for computing the various divided differences that will be required are
given in Appendix 5.  We turn now to the question: what happens when
$\zeta_1 = \zeta_2$?

The original function (4.2) is undefined when $\zeta_1 = \zeta_2$.  The
modified equation

S1 T0V0  - S0Tt v7  - 0 

turns out to be satisfied whenever $\zeta_1 = \zeta_2$.  The divided
version, (6.6), is not so easily satisfied; let us examine what happens
to its terms as $\zeta_1 \to \zeta_2$.

Ws  discover  that 


* &tk)  * • 
C,.C2Vp.k  ' ; P'(C)  - kCk_Tp(c)  , 

Al.'4m*tV.lc-AV(C).k?k-'p.{t)  . 


Substituting these expressions in (6.6) and simplifying leads eventually
to the equation to be solved for the nearest triple zero
(III.7.1).  Recall that the case of a complex conjugate pair also reduced
to a triple zero when the divided differences became confluent.  Just
as in that case, numerical methods will be required to inhibit convergence
to the triple zero solutions we wish to avoid.

Both the method of this section and the method for k > 1 double
zeros may be used when two double zeros are required.  Both methods
seem to work satisfactorily for polynomials of low degree, but the
general method for k double zeros worked better for Wilkinson's polynomial
of degree 20 discussed in Chapter X.  The equations described in
this section seem to have a much greater propensity for causing Newton's
method to dawdle aimlessly without converging.  It may be that the
divided differences warp the geometry of the function whose zeros are
sought in a way that tends to conceal the zeros.  There is some compensation
in the fact that those divided differences help prevent coalescence
of the zeros much more effectively than numerical means alone.


CHAPTER  VI 


LOCATION  THEORY  FOR  NEAREST  POLYNOMIALS  WITH  A DOUBLE  ZERO 
1.  Introduction

In this chapter may be found some clues to the answer to the
question: Given a polynomial $p$, all of whose zeros are simple, where
should we look to find the nearest polynomial $p + q$ with a double
zero $\zeta$?  That $\zeta$ which minimizes $\|q\|$ globally is one of the
solutions of the equation

(1.1)  $$ F(\zeta) = \sigma_1p(\zeta) - \sigma_0\zeta p'(\zeta) = 0 ; $$

but there are usually many other solutions, most of which represent
local minima.

Remember that the real non-analytic functions $\sigma_0$ and $\sigma_1$ are
defined as

$$ \sigma_0 = \sum_{j=1}^{n} |\zeta^2|^{n-j}/w_j , $$

$$ \sigma_1 = \sum_{j=1}^{n} |\zeta^2|^{n-j}(n-j)/w_j . $$

Thus we are considering only the norms derived from diagonal Hermitian
quadratic forms.  Most of the results to follow, moreover, only apply
to real polynomials $p$.

The purpose of attempting to develop a theory of location is to
make our numerical solution procedures more efficient.  Equation (1.1)
is typically solved by Newton's method from some starting point.  An
ideal starting point would have the property that Newton's method
would always converge to the global minimum corresponding to the
nearest polynomial with a double zero.  A satisfactory starting point
would always converge to a local minimum that is nearly globally
minimal.  The ad hoc starting procedures discussed in Chapter VIII
usually seem to be satisfactory but the known theory is insufficient
to account for their success.

The results in the following sections seem far from optimal.  One
might hope that a theory could be developed comparable to the elegant
theory of the location of zeros of polynomials discussed by Marden [21]
and Householder [12].  But much of the theory for polynomials hinges
on the entire analytic nature of polynomial functions.  Certain of the
examples to follow effectively counter some of the conjectures that
might be made by analogy with the polynomial case.

We can make a few preliminary observations about (1.1).  Among
its solutions are the global minimum we seek, numerous other local
minima, a few non-minimal stationary points, and the solution $\zeta = 0$.
This solution $\zeta = 0$ is an artifact of the way we wrote the equation.
We could just as well divide by $\zeta$ and write

(1.2)  $$ (\sigma_1/\zeta)\,p(\zeta) - \sigma_0\,p'(\zeta) = 0 . $$

Then $\zeta = 0$ is a solution of this equation only if $p'(0) = 0$; that
is, only if the next to last coefficient $p_{n-1} = 0$.  An examination of
the stationary condition $q^*W = \ell^*A$ tells us that $q_{n-1} = 0$ while
the constraint $Ap + Aq = 0$ tells us that $q_{n-1} = -p_{n-1}$.  Therefore
$\zeta = 0$ is a stationary point for $\|q\|$ if and only if $p_{n-1} = 0$.  Even
then $\zeta = 0$ need not represent a minimum.

Since the factor $\zeta$ does not seem to contribute any information,
why not leave it out in our subsequent analyses?  We keep it for a
reason which becomes apparent when we write (1.1) in yet a third form:

(1.3)  $$ \frac{\zeta p'(\zeta)}{p(\zeta)} = \frac{\sigma_1(\zeta)}{\sigma_0(\zeta)} \equiv R(\zeta) . $$

Now

$$ R(\zeta) = \Bigl[\sum_j (|\zeta^2|^{n-j}/w_j)\cdot(n-j)\Bigr] \Big/ \Bigl[\sum_j (|\zeta^2|^{n-j}/w_j)\Bigr] $$

may be thought of as a weighted average of the quantities $(n-j)$.  If
we do so then we realize that

$$ 0 \le R(\zeta) < n-1 $$

for

$$ 0 \le |\zeta| < \infty . $$

Thus (1.3) equates a meromorphic function of the complex variable $\zeta$
to a bounded positive real function of $|\zeta|$, which is in fact
analytic when regarded as a real function of a real variable.  If the
factor of $\zeta$ were removed from (1.3) it would lose its attractive form.
We will exploit that form later.
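The quantities in (1.1) and (1.3) are easy to evaluate directly. The sketch below is illustrative: the function names `sigmas`, `F` and `R` and the sample weights are assumptions, $p$ is stored as monic coefficients in descending order, and the weights $w_1, \ldots, w_n$ are assumed finite and positive.

```python
import numpy as np

def sigmas(zeta, w):
    """sigma_0 and sigma_1 for diagonal weights w_1..w_n."""
    n = len(w)
    expo = n - np.arange(1, n + 1)                   # exponents n-j, j = 1..n
    terms = (abs(zeta) ** 2) ** expo / np.asarray(w, dtype=float)
    return terms.sum(), (terms * expo).sum()

def F(zeta, p, w):
    """Left side of (1.1): sigma_1 p(zeta) - sigma_0 zeta p'(zeta)."""
    s0, s1 = sigmas(zeta, w)
    return s1 * np.polyval(p, zeta) - s0 * zeta * np.polyval(np.polyder(p), zeta)

def R(zeta, w):
    """Weighted average of the exponents n-j appearing in (1.3)."""
    s0, s1 = sigmas(zeta, w)
    return s1 / s0

w = [1.0, 2.0, 3.0, 4.0]
p = np.poly([1.0, 2.0, 3.0, 4.0])
samples = [0.3, -2.0 + 1.5j, 10.0j]
R_values = [R(z, w) for z in samples]
```

Since `R` depends on $\zeta$ only through $|\zeta|$, it can be tabulated once along the positive real axis and reused for every argument of $\zeta$.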

A typical result in this theory is the following.

Proposition.  Let $p$ be real with two real zeros $\alpha_1$ and $\alpha_2$,
$\alpha_1 \le \alpha_2$.  Then $F$ has a real zero in $[\alpha_1, \alpha_2]$.

Proof.  If $\alpha_1$ and $\alpha_2$ have opposite signs or if either is zero,
then $\zeta = 0$ satisfies the assertion.  Then without loss of generality
assume that $0 < \alpha_1 < \alpha_2$ and that $(\alpha_1, \alpha_2)$ contains no real zero of
$p$.  Then

$$ F(\alpha_1)F(\alpha_2) = \sigma_0(\alpha_1)\sigma_0(\alpha_2)\,\alpha_1\alpha_2\,p'(\alpha_1)p'(\alpha_2) . $$

If that product is zero or negative then a zero of $F$ lies in $[\alpha_1, \alpha_2]$
by the intermediate value theorem.  But if that product is positive
then $p'(\alpha_1)p'(\alpha_2) > 0$.  Considering Taylor series, we see that

$$ p(\alpha_1+\delta) \approx \delta p'(\alpha_1) , $$
$$ p(\alpha_2-\delta) \approx -\delta p'(\alpha_2) , $$

for small enough $\delta > 0$.  Thus

$$ p(\alpha_1+\delta)\,p(\alpha_2-\delta) \approx -\delta^2p'(\alpha_1)p'(\alpha_2) < 0 $$

so that $p$ must have another zero in $[\alpha_1, \alpha_2]$, contrary to assumption.
The contradiction implies $F(\alpha_1)F(\alpha_2) \le 0$ and concludes the proof.
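The sign change guaranteed by this proposition lends itself to bisection. The following sketch is one possible illustration, not a procedure taken from the text: the names `F` and `bisect_F` are hypothetical, the weights are taken equal to one, and the cubic with zeros 1, 2, 3 is an arbitrary example.

```python
import numpy as np

def F(zeta, p, w):
    """sigma_1 p(zeta) - sigma_0 zeta p'(zeta) of (1.1), for real zeta."""
    n = len(w)
    expo = n - np.arange(1, n + 1)
    terms = (zeta ** 2) ** expo / np.asarray(w, dtype=float)
    s0, s1 = terms.sum(), (terms * expo).sum()
    return s1 * np.polyval(p, zeta) - s0 * zeta * np.polyval(np.polyder(p), zeta)

def bisect_F(a, b, p, w, tol=1e-12):
    """Locate a real zero of F between two adjacent real zeros a < b of p."""
    fa, fb = F(a, p, w), F(b, p, w)
    if fa * fb > 0:
        raise ValueError("F does not change sign on [a, b]")
    while b - a > tol:
        mid = 0.5 * (a + b)
        fm = F(mid, p, w)
        if fa * fm <= 0:
            b, fb = mid, fm
        else:
            a, fa = mid, fm
    return 0.5 * (a + b)

p = np.poly([1.0, 2.0, 3.0])      # zeros 1, 2, 3
w = [1.0, 1.0, 1.0]
zeta_12 = bisect_F(1.0, 2.0, p, w)
```

A point found this way is a real stationary point and could serve as a starting guess, although, as noted in this chapter, it need not be the global minimum.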


2.  No Complex Solutions for Certain Real Polynomials

Wilkinson's  polynomial  of  chapter  X has  the  property  that  all  its 
zeros  are  real  and  have  the  same  sign.  When  solving  (1.1)  for 
Wilkinson's  polynomial  we  need  not  search  for  complex  zeros  because 
of  the  following. 

Proposition.  Let $p(\tau) = \prod_j(\tau^m-\alpha_j)$ be a complex polynomial in $\tau^m$.
If all the numbers $\alpha_j$ are either zero or have the same argument $\theta$
then the non-zero solutions $\zeta$ of (1.1) may only have arguments
$(\theta+k\pi)/m$, $0 \le k \le 2m-1$.

Corollary.  If a real polynomial $p(\tau) = \prod_j(\tau-\alpha_j)$ has all real
zeros $\alpha_j$ all of the same sign, then all its $\zeta$'s are real.

Corollary.  If an even real polynomial $p(\tau) = \prod_j(\tau^2-\alpha_j^2)$ has all
zeros $\pm\alpha_j$ real, then all its $\zeta$'s are either real or pure imaginary.

Proof of Proposition.  Rewrite (1.1) in the form of (1.3):

$$ \zeta p'(\zeta)/p(\zeta) = R(|\zeta|) . $$

Remember $R$ is a real function of $|\zeta|$ and $0 \le R < n-1$.  Suppose
first the special case that all $\alpha_j = 0$ so $p(\tau) = \tau^n$.  Then (1.1)
reduces to $n = R(\zeta)$, so the only solution of (1.1) is the universal
solution $\zeta = 0$.

Otherwise we may assume that at least one $\alpha_j \ne 0$.  Recall that

$$ p'(\zeta)/p(\zeta) = \sum_j m\zeta^{m-1}/(\zeta^m-\alpha_j) ; $$

take imaginary parts of (1.3) to find

$$ 0 = \mathrm{Im}(\zeta p'/p) , $$
$$ 0 = \mathrm{Im}\Bigl(\sum_j m\zeta^m\,\overline{(\zeta^m-\alpha_j)}/|\zeta^m-\alpha_j|^2\Bigr) , $$
$$ 0 = \mathrm{Im}(\zeta^me^{-i\theta})\cdot\sum_j |\alpha_j|/|\zeta^m-\alpha_j|^2 . $$

Since at least one $\alpha_j$ is non-zero the sum of positive quantities
may not vanish.  Then if $\phi$ denotes the argument of a nonzero $\zeta$,
we have

$$ \mathrm{Im}(\exp(i(m\phi-\theta))) = 0 $$

from which the result follows.

Note the two resulting equations for $|\zeta|$ are

$$ R(|\zeta|) = \sum_j m|\zeta|^m/(|\zeta|^m \mp |\alpha_j|) , $$

which could be expressed as two real polynomials of degree $3n - 2$ in
$|\zeta|$.  However, for polynomials in $\tau^m$ it might be reasonable to
restrict perturbations to polynomials in $\tau^m$ by causing appropriate
weights in the norm to become infinite.  Then $R(|\zeta|)$ becomes $R(|\zeta|^m)$
and the resulting polynomials are of degree $(3n-2)/m$ in $|\zeta|^m$.
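For the first corollary ($m = 1$, $\theta = 0$, all zeros positive) the imaginary-part computation in the proof gives $\mathrm{Im}(\zeta p'/p) = -\mathrm{Im}\,\zeta\,\sum_j \alpha_j/|\zeta-\alpha_j|^2$, which cannot vanish off the real axis. The sketch below checks this identity numerically; the zeros 1 to 5 and the sample points are arbitrary choices made for illustration, and the function names are hypothetical.

```python
import numpy as np

alphas = np.arange(1.0, 6.0)          # zeros 1, 2, ..., 5, all positive
p = np.poly(alphas)
dp = np.polyder(p)

def imag_log_derivative(zeta):
    """Im(zeta p'(zeta) / p(zeta)), the left side of the imaginary part of (1.3)."""
    return (zeta * np.polyval(dp, zeta) / np.polyval(p, zeta)).imag

def predicted(zeta):
    """-Im(zeta) * sum_j alpha_j / |zeta - alpha_j|^2, from the proof with m = 1."""
    return -zeta.imag * np.sum(alphas / abs(zeta - alphas) ** 2)

samples = [0.5 + 0.3j, -2.0 + 1.0j, 3.0 - 4.0j]
pairs = [(imag_log_derivative(z), predicted(z)) for z in samples]
```

Because the right side of (1.3) is real, a nonzero imaginary part on the left rules out a solution at that point, whatever the weights are.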


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3.  Counterexample 

The  previous  proposition  might  lead  one  to  hope  that  polynomials 
with  all  zeros  real  would  not  have  complex  solutions  to  (1.1).  The 
following  counterexample,  produced  by  W.  Kahan,  eliminates  such  hopes: 

Example.  Let $n = 2$ and $p(\tau) = (\tau-1)(\tau+1)$.  If $2w_1 < w_2$
then (1.1) has a complex solution

$$ \zeta = \pm i\sqrt{1 - (2w_1/w_2)} . $$

Comments.  Some other surprising facts may be learned from this
one example.  We start by deriving all the solutions of (1.1).  Let
$\omega = (w_1/w_2) > 0$.  Then (1.1) is

$$ |\zeta|^2(\zeta^2-1) - (|\zeta|^2+\omega)\zeta(2\zeta) = 0 $$

or, dividing by the solution $\zeta = 0$,

$$ \zeta(|\zeta|^2 + 2\omega) + \bar\zeta = 0 ; $$

then

$$ (\mathrm{Re}\,\zeta)(|\zeta|^2 + 2\omega + 1) = 0 $$

and

$$ (\mathrm{Im}\,\zeta)(|\zeta|^2 + 2\omega - 1) = 0 . $$

By considering the various possibilities we conclude that the only
solutions of these equations are just $\zeta = 0$ and, if $\omega < \tfrac12$,
$\zeta = \pm i(1-2\omega)^{1/2}$.  The norm of the corresponding $q$'s may be calculated
to be

$$ \|q\|^2 = w_2 , \qquad \text{for a double zero at } 0 , $$
$$ \|q\|^2 = 4\omega(1-\omega)w_2 , \qquad \text{for a double zero at } \pm i(1-2\omega)^{1/2} . $$
So for $0 < \omega < \tfrac12$, the global minima are at $\zeta = \pm i(1-2\omega)^{1/2}$, not at
$\zeta = 0$.  In this case, 0 represents a saddle point; it is where the
global minimum occurs if only real $\zeta$ are considered.  But on the
imaginary axis, the minima occur elsewhere, and a local maximum occurs
at $\zeta = 0$ if only pure imaginary $\zeta$ are considered.
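The figures in this example can be reproduced with a few lines of arithmetic. In the sketch below the weights $w_1 = 1$, $w_2 = 4$ (so $\omega = 1/4$) are an arbitrary choice satisfying $2w_1 < w_2$, and the function names are hypothetical; for $n = 2$ the requirement $p + q = (\tau-\zeta)^2$ fixes $q$ completely.

```python
import numpy as np

w1, w2 = 1.0, 4.0                      # hypothetical weights with 2*w1 < w2
omega = w1 / w2
p = np.array([1.0, 0.0, -1.0])         # p(t) = t^2 - 1

def F(zeta):
    """sigma_1 p(zeta) - sigma_0 zeta p'(zeta) of (1.1) for n = 2."""
    s0 = abs(zeta) ** 2 / w1 + 1.0 / w2
    s1 = abs(zeta) ** 2 / w1
    return s1 * np.polyval(p, zeta) - s0 * zeta * np.polyval(np.polyder(p), zeta)

def q_norm_sq(zeta):
    """Weighted norm of q when p + q = (t - zeta)^2, i.e. q = (-2 zeta, zeta^2 + 1)."""
    q = np.array([-2.0 * zeta, zeta ** 2 + 1.0])
    return float(np.sum(np.array([w1, w2]) * abs(q) ** 2))

zeta_c = 1j * np.sqrt(1.0 - 2.0 * omega)
norm_complex = q_norm_sq(zeta_c)
norm_origin = q_norm_sq(0.0)
```

With these weights the double zero on the imaginary axis costs 3 while the double zero at the origin costs 4, in agreement with the two norm formulas above.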

Of  course,  there  are  other  real  polynomials  with  all  zeros  real 
which  have  solutions  of  (1.1)  which  are  complex  but  not  pure  imaginary. 

It is perhaps surprising that an even real polynomial with some zeros real and some pure imaginary may have solutions $\zeta$ of (1.1) which are neither real nor pure imaginary. For instance, by appropriate choice of weights so that the $R(|\zeta|)$ of (1.3) has the value 2 when $|\zeta| = 1$, we find that some solutions $\zeta$ for the polynomial

$$p(\tau) = \tau^4 - 1$$

are $\zeta = 0$ and $\zeta = (\pm 1 \pm i)/\sqrt{2}$. We may further restrict the weights so that these are the only $\zeta$'s.

Returning to Kahan's counterexample, recall the Lucas theorem: the convex hull of the zeros of a polynomial contains all the zeros of its derivative. The present example shows that no such simple statement may be made about the geometrical relationship between the zeros of a polynomial and the solutions of (1.1). Some early experimental results suggested that the convex hull of the origin and the zeros of the polynomial always contained the global minimum. But the counterexample shows that this is not always the case.

Yet the solutions of (1.1) do behave somewhat like the zeros of the derivative of the corresponding polynomial. Consider these symmetry facts:

1)  If p is real then F of (1.1) is real;

2)  if p is odd then so is F;

3)  if p is even then so is F;

4)  if all the zeros of p are multiplied by a constant phase factor $\exp(i\theta)$ then so are the zeros of F.

Thus there is no essential difference between a real polynomial and a complex one whose zeros are symmetric about a line through the origin.

In contrast, consider this invariance of polynomials under scaling: if the zeros of p are all multiplied by a scale factor, then all the zeros of all the derivatives are scaled by the same factor. But if the weights in the $\sigma$'s of (1.1) are regarded as fixed, then scaling the zeros of p does not introduce a corresponding scaling of the solutions of (1.1), which change in a complicated way. One could regard the weights as depending on the scaling factor, however. If, for instance,

$$w_j = c_j(\mu^2)^{n-j}$$

where $c_j$ is fixed and $\mu$ is the modulus of the zero of p of largest modulus, then a scaling change in the zeros of p will produce a corresponding scaling of the solutions of (1.1). One could go further and imagine that $\mu \equiv |\zeta|$, a function of the ostensibly unknown $\zeta$. Then the $\sigma$'s are constant and the F of (1.1) takes an especially simple form: it becomes a polynomial. In some of the sections to follow this analytical "swindle" will be exploited.

4.  A Bound on the Solutions $\zeta$

We will exploit Theorem (17,2a) of Marden [21] to bound the solutions of (1.1). It is not immediately obvious how large those solutions might be, relative to the zeros of the polynomials.

Marden's theorem concerns the location of the zeros of a linear combination of monic polynomials of degree n. Let $x(\tau) - \lambda y(\tau)$ be that linear combination, and let $C(c,r)$ represent a circle of radius r centered at c. $C_x(c_x,r_x)$ contains all the zeros of x and $C_y(c_y,r_y)$ contains all the zeros of y. The theorem asserts that all the zeros of $x - \lambda y$ lie in the union of the n circles $C_k(\gamma_k,\rho_k)$, $1 \le k \le n$, where

$$\gamma_k = (c_x - \omega_k c_y)/(1 - \omega_k)$$

and

$$\rho_k = (r_x + |\omega_k| r_y)/|1 - \omega_k|$$

and

$$\omega_k = \lambda^{1/n}\epsilon_k .$$

The $\epsilon_k$ are the n nth roots of 1.

Our  result  is  the  following. 

Corollary. If $|\alpha_{\max}|$ is the maximum modulus of the zeros of p, then all the solutions $\zeta$ of (1.1) satisfy

(4.1)  $|\zeta| < 2n^2|\alpha_{\max}|$ .

Proof. Rewrite (1.1) in a form appropriate to the theorem:

$$G_R(\zeta) = \zeta p'(\zeta)/n - (R/n)\,p(\zeta) = 0 .$$

Then if R is held fixed, $G_R$ is in the proper form. Let the circles $C_x$ and $C_y$ be crudely approximated by $C(0,|\alpha_{\max}|)$. This circle certainly contains all the zeros of p, and hence of p', as well as 0. Then $\gamma_k = 0$ so the circles of the theorem are concentric and only the radius of the largest matters:

$$\rho_k = \frac{1 + (R/n)^{1/n}}{|1 - (R/n)^{1/n}\epsilon_k|}\,|\alpha_{\max}| .$$

Remembering that $0 \le R < n-1$, it is clear that

$$\rho_k \le \frac{1 + (R/n)^{1/n}}{1 - (R/n)^{1/n}}\,|\alpha_{\max}| < \frac{2}{1 - ((n-1)/n)^{1/n}}\,|\alpha_{\max}| \le 2n^2|\alpha_{\max}| .$$

Since any solution of (1.1) is a zero of $G_R$ for some positive $R < n-1$, the bound is valid for all such solutions. Q.E.D.
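The last step of the chain of inequalities, $2/(1 - ((n-1)/n)^{1/n}) \le 2n^2$, follows from Bernoulli's inequality $(1 - 1/n)^{1/n} \le 1 - 1/n^2$. A small numerical sketch of that step is given below; the function names are illustrative assumptions, not part of the original.

```python
def crude_bound_factor(n: int) -> float:
    """The factor 2 / (1 - ((n-1)/n)**(1/n)) that multiplies |alpha_max|
    before the final simplification in the proof."""
    return 2.0 / (1.0 - ((n - 1) / n) ** (1.0 / n))


def final_bound_factor(n: int) -> float:
    """The simplified factor 2*n**2 appearing in (4.1)."""
    return 2.0 * n * n


if __name__ == "__main__":
    for degree in (2, 5, 20):
        print(degree, crude_bound_factor(degree), final_bound_factor(degree))
```

The two factors stay close to each other as n grows, so the simplification to $2n^2$ loses little; the crudeness of (4.1) comes from the earlier approximations.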


The purpose of this crude estimate is just to show that the solutions of (1.1) are bounded. The gross approximations involved might lead one to doubt that the bound is realistic, and indeed for "normal" polynomials the solutions do not seem to exceed $|\alpha_{\max}|$.

However Wilkinson's polynomial of degree 20, discussed in chapter X, has a solution for (1.1) at -117.31; the norm has $w_j = 1/|p_j|$ which minimizes relative changes in the coefficients. In this case $|\zeta|$ exceeds $|\alpha_{\max}|$ by a factor of nearly 5. Presumably by appropriate choice of norm that factor could be made even larger; how much larger is unknown.


One might consider a type of iteration scheme: since the bound (4.1) depends heavily on the maximum value of R, which we bounded by $n-1$, any knowledge that reduces that $R_{\max}$ should affect the bound appreciably. But R is monotonic in $|\zeta|$ so $R_{\max}$ depends on the bound on $|\zeta|$, which is in turn dependent on $R_{\max}$. Clearly we could reduce the bounds on $|\zeta|$ and $R_{\max}$ alternatingly. Unfortunately in practice such an iteration seems to improve the bound so little as to be scarcely worth the trouble.

5.  Propositions for Real Quadratic Polynomials

The example of section 3 was a counter to a tempting, but incorrect assertion. That same example could be regarded positively, however, as an example of the propositions of the present section.

Proposition 5.1. Consider a real monic quadratic polynomial

$$p(\tau) = \tau^2 - 2\alpha\tau + \gamma .$$

Let $\mu$ be the modulus of its largest zero. Then every solution $\zeta$ of (1.1) satisfies $|\zeta| \le \mu$.

Proof. By examination of cases. Equation (1.1) may be written

$$(\zeta^2 - 2\alpha\zeta + \gamma)(|\zeta|^2/w_1) - \zeta(2\zeta - 2\alpha)(|\zeta|^2/w_1 + 1/w_2) = 0 .$$

Factor out $\zeta$ to remove the uninteresting solution $\zeta = 0$; then letting $\omega = (w_1/w_2) > 0$, and taking real and imaginary parts leaves the equations

(5.1)  $|\zeta|^2\,\mathrm{Re}\,\zeta + (2\omega - \gamma)\,\mathrm{Re}\,\zeta - 2\alpha\omega = 0$ ,

(5.2)  $|\zeta|^2\,\mathrm{Im}\,\zeta + (2\omega + \gamma)\,\mathrm{Im}\,\zeta = 0$ .

The second of these equations is satisfied if $|\zeta|^2 = -(2\omega+\gamma)$ or $\mathrm{Im}\,\zeta = 0$, providing two cases.

In the first of these cases $\gamma < 0$ so the zeros of p are real and

$$\mu = |\alpha| + (\alpha^2-\gamma)^{1/2} .$$

But we may easily verify that

$$|\zeta|^2 = -(2\omega+\gamma) \le \mu^2$$

as claimed.

$\mathrm{Im}\,\zeta = 0$ in the second case so the solutions $\zeta$ are just the real solutions of (5.1), which satisfy

(5.3)  $g(\zeta) = \zeta^3 + (2\omega-\gamma)\zeta - 2\alpha\omega = 0$ .

g may have complex solutions but these do not satisfy (1.1).

We will prove the proposition by showing that $g(-\mu) < 0$, $g(+\mu) > 0$, and the real critical points where $g'(\zeta)$ vanishes are contained in $[-\mu,+\mu]$. Thus the real zeros of g are bracketed in $[-\mu,+\mu]$ whether they be 1, 2, or 3 in number. The details, however, depend on whether the zeros of p are real or complex.

Suppose first that $\alpha^2 < \gamma$ so the zeros of p are complex and $\mu = \gamma^{1/2}$. Then

$$g(-\mu) = -2\omega(\gamma^{1/2}+\alpha) < 0$$

and

$$g(+\mu) = +2\omega(\gamma^{1/2}-\alpha) > 0 .$$

Furthermore the zeros of g' are $\pm((\gamma-2\omega)/3)^{1/2}$. When these zeros are real they are less than $\gamma^{1/2}$ in modulus since $\omega > 0$.

Now suppose that $\alpha^2 \ge \gamma$ so the zeros of p are real and

$$\mu = |\alpha| + (\alpha^2-\gamma)^{1/2} .$$

Then

$$g(-\mu) = -\mu(\mu^2+2\omega-\gamma) - 2\alpha\omega$$

$$g(+\mu) = +\mu(\mu^2+2\omega-\gamma) - 2\alpha\omega .$$

It is easy to verify that

$$\mu^2 + 2\omega - \gamma > 0$$

and

$$|\mu(\mu^2+2\omega-\gamma)| > |2\alpha\omega|$$

so $g(-\mu) < 0$ and $g(+\mu) > 0$. And finally we may verify that when g' has real zeros $\pm((\gamma-2\omega)/3)^{1/2}$, they do not exceed $\mu$ in magnitude. Q.E.D.

Our  next  result  is  in  a similar  vein. 

Proposition 5.2. Consider a real monic quadratic polynomial

$$p(\tau) = \tau^2 - 2\alpha\tau + \gamma .$$

Then there is a solution $\zeta$ of (1.1) in the smallest circle containing both zeros of p.

Proof. The zeros of p are $\alpha \pm (\alpha^2-\gamma)^{1/2}$ and the smallest circle containing them has center $\alpha$ and radius $|\alpha^2-\gamma|^{1/2}$. Therefore the assertion is that there is a solution $\zeta$ such that

$$|\zeta-\alpha| \le |\alpha^2-\gamma|^{1/2} .$$

The solution $\zeta = 0$ satisfies the proposition if $\gamma < 0$ or $\gamma > 2\alpha^2$, so assume henceforth that $0 < \gamma < 2\alpha^2$.

Recalling equations (5.1) and (5.2), we find that the only remaining solutions are the real solutions of

$$g(\zeta) = \zeta^3 + (2\omega-\gamma)\zeta - 2\alpha\omega = 0 .$$

Thus we must show that there is a solution $\zeta$ in $[\eta,\theta]$ where

$$\eta = \alpha - |\alpha^2-\gamma|^{1/2} , \qquad \theta = \alpha + |\alpha^2-\gamma|^{1/2} .$$

We do so by demonstrating that $g(\eta)\,g(\theta) \le 0$.

Now

$$g(\eta)g(\theta) = \alpha^2(3|\alpha^2-\gamma| + \alpha^2 - \gamma)^2 - |\alpha^2-\gamma|(|\alpha^2-\gamma| + 3\alpha^2 - \gamma + 2\omega)^2 .$$

Suppose first that $\alpha^2 \ge \gamma$. Then

$$g(\eta)g(\theta) = -4(\alpha^2-\gamma)(\gamma^2 + \omega^2 + 2\omega(2\alpha^2-\gamma)) .$$

But the last factor is easily seen to be positive.

Suppose that $\alpha^2 < \gamma$. Then

$$g(\eta)g(\theta) = -4(\gamma-\alpha^2)(\omega^2 + 2\alpha^2\omega + \alpha^2(2\alpha^2-\gamma)) .$$

But at the outset we restricted $\gamma < 2\alpha^2$. Q.E.D.
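The two closed forms for $g(\eta)g(\theta)$ can be compared against a direct evaluation of the cubic g of (5.3). The sketch below does this for given $\alpha$, $\gamma$, $\omega$; the helper names are illustrative assumptions and the code only restates the formulas of the proof.

```python
import math


def g(zeta: float, alpha: float, gamma: float, omega: float) -> float:
    """The cubic of (5.3): zeta^3 + (2*omega - gamma)*zeta - 2*alpha*omega."""
    return zeta ** 3 + (2.0 * omega - gamma) * zeta - 2.0 * alpha * omega


def endpoint_product(alpha: float, gamma: float, omega: float) -> float:
    """g(eta) * g(theta) evaluated directly at the ends of the diameter [eta, theta]."""
    d = math.sqrt(abs(alpha * alpha - gamma))
    return g(alpha - d, alpha, gamma, omega) * g(alpha + d, alpha, gamma, omega)


def closed_form(alpha: float, gamma: float, omega: float) -> float:
    """The factored expressions from the proof, one for each sign of alpha^2 - gamma."""
    a2 = alpha * alpha
    if a2 >= gamma:
        return -4.0 * (a2 - gamma) * (gamma ** 2 + omega ** 2 + 2.0 * omega * (2.0 * a2 - gamma))
    return -4.0 * (gamma - a2) * (omega ** 2 + 2.0 * a2 * omega + a2 * (2.0 * a2 - gamma))
```

When $0 < \gamma < 2\alpha^2$ both expressions are non-positive, so g changes sign on $[\eta,\theta]$ and a real solution lies on the diameter of the circle.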

This last proposition might lead one to suppose that for any polynomial p of degree $n \ge 2$, equation (1.1) has a solution in the smallest circle containing two zeros of p. In section 7 this supposition will be shown to be incorrect, and a weaker conjecture will be proposed.


6.  Swindle  Results  for  Real  Quadratic  Polynomials 

A method for evading certain problems arising from the non-analyticity of (1.1) was briefly mentioned in section 3. Namely, each weight in the norm was defined to be

$$w_j = c_j|\zeta|^{2(n-j)} .$$

Thus $\sigma_0$ and $\sigma_1$ are constant and therefore so is R of (1.3). This amounts to an analytical swindle since the dependence of the $w_j$ on $\zeta$ was not incorporated into the derivation of (1.1). Nonetheless any solution of (1.1) is also a solution of

(6.1)  $s(\zeta) = \zeta p'(\zeta) - R\,p(\zeta) = 0$

for some fixed R; the R depends on $|\zeta|$ in general, but not in the swindle case. In either case $0 < R < n-1$.

It  is  useful  to  study  the  solutions  of  (6.1)  for  fixed  R to  see 
what  light  they  shed  on  the  original  problem. 

We start by noting that (6.1) has a solution $\zeta = 0$ only if $p(0) = 0$. So the part of the previous theory that depends on a solution at $\zeta = 0$ may not necessarily be true.

Write the quadratic p as

$$p(\tau) = \tau^2 - 2\alpha\tau + \gamma$$

so $\alpha$ is the arithmetic mean of the zeros of p and $\gamma$ is their product. Then the zeros of s are

$$\zeta = \left(\frac{1-R}{2-R}\right)\alpha \pm \left[\left(\frac{1-R}{2-R}\right)^2\alpha^2 + \left(\frac{R}{2-R}\right)\gamma\right]^{1/2} .$$

In the limiting case $R \to 0$, the $\zeta$'s approach $\alpha$ and 0. In contrast, as $R \to 1$ the $\zeta$'s approach $\pm\gamma^{1/2}$. So, in particular, if $\gamma < 0$, corresponding to the zeros of p being real and opposite in sign, then in the second limit the zeros are pure imaginary. This situation corresponds to the counterexample of section 3.
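The closed form for the zeros of s and its two limits can be checked with a few lines of code. This is a minimal sketch assuming $s(\zeta) = \zeta p'(\zeta) - R\,p(\zeta)$ with $p(\tau) = \tau^2 - 2\alpha\tau + \gamma$ as above; the function names are illustrative.

```python
import cmath


def swindle_zeros(alpha: float, gamma: float, R: float) -> tuple:
    """Both zeros of s(zeta) = zeta*p'(zeta) - R*p(zeta) for the real quadratic p."""
    c = (1.0 - R) / (2.0 - R)
    disc = c * c * alpha * alpha + (R / (2.0 - R)) * gamma
    root = cmath.sqrt(disc)  # complex square root handles a negative discriminant
    return (c * alpha + root, c * alpha - root)


def s(zeta: complex, alpha: float, gamma: float, R: float) -> complex:
    """Direct evaluation of s, used to confirm that the closed form is a zero."""
    p = zeta * zeta - 2.0 * alpha * zeta + gamma
    dp = 2.0 * zeta - 2.0 * alpha
    return zeta * dp - R * p
```

With $\gamma < 0$ and R close to 1 the two zeros are close to the imaginary axis, as in the counterexample of section 3.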

Two results from the previous section that the limiting cases support are that 1) the magnitude of the $\zeta$'s does not exceed that of the larger zero of p, and 2) there is always one $\zeta$ in the smallest circle containing both zeros of the quadratic p. These are correct inferences.

Proposition 6.1. Let $\zeta$ be any solution of (6.1) when p is a real quadratic polynomial. Then $|\zeta|$ does not exceed the magnitude of the larger zero of p.

Proof. Consider four cases: the zeros of p are equal; the zeros of p are complex; the zeros of p are real as are the $\zeta$; the zeros of p are real but the $\zeta$ are complex. The first case is trivial and the other three cases are similar in proof. For the last case, for instance, we have

$$(1-R)^2\alpha^2 + R(2-R)\gamma < 0 \quad\text{and}\quad \alpha^2 > \gamma .$$

Obviously $\gamma < 0$. We wish to compare $|\zeta|$ with $\mu$, the modulus of the larger zero of p:

$$|\zeta|^2 = -\left(\frac{R}{2-R}\right)\gamma , \qquad \mu = |\alpha| + (\alpha^2-\gamma)^{1/2} .$$

Thus

$$\mu^2 - |\zeta|^2 = 2\alpha^2 + 2|\alpha|(\alpha^2-\gamma)^{1/2} - 2\gamma\left(\frac{1-R}{2-R}\right)$$

which is a sum of non-negative terms, since $\gamma < 0$ and $R < 1$. The last term is positive so $|\zeta| < \mu$. Q.E.D.

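Proposition 6.2. The smallest circle containing both zeros of a real quadratic p contains a solution of (6.1).

Proof. As in the previous proposition there are four cases. Below we sketch the proof of the case in which both zeros of p are complex. Then $\alpha^2 < \gamma$, $\gamma > 0$, and both $\zeta$'s are real. We wish to show that $|\zeta-\alpha| \le (\gamma-\alpha^2)^{1/2}$ for one of the $\zeta$'s.

Now

$$\zeta - \alpha = -\frac{\alpha}{2-R} \pm \Delta^{1/2}$$

where

$$\Delta = \left(\frac{1-R}{2-R}\right)^2\alpha^2 + \left(\frac{R}{2-R}\right)\gamma .$$

Then

$$|\zeta-\alpha|^2 = \left(\frac{1}{2-R}\right)^2\alpha^2 \mp \left(\frac{2\alpha}{2-R}\right)\Delta^{1/2} + \left(\frac{1-R}{2-R}\right)^2\alpha^2 + \left(\frac{R}{2-R}\right)\gamma$$

and we want to show that for either + or -

$$\frac{\alpha^2}{2-R} \mp \alpha\Delta^{1/2} \le (1-R)(\gamma-\alpha^2) .$$

If we choose the sign that makes $\mp\alpha$ negative we find that the last inequality is equivalent to $\gamma > \alpha^2$, which is what we assumed.

The proofs of the other cases are similar. Q.E.D.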

As a tool for analysis the swindle does not seem to help much in the quadratic case. All of the propositions about quadratics are proved just as easily without the swindle. It is nevertheless helpful to verify the theories in the quadratic case, since it is difficult to extend results to higher degrees without the swindle.

7.  The Smallest Circle Containing Two Zeros Need Not Contain a $\zeta$

In sections 1 and 5 we learned that 1) there is a real $\zeta$ between any two real zeros of a real polynomial p, 2) a corresponding result holds for complex polynomials symmetric about a line through the origin, and 3) the smallest circle containing the two zeros of a real quadratic polynomial contains a $\zeta$. Furthermore, when $\alpha$ is a complex zero of a real p with $|\mathrm{Re}\,\alpha| < |\mathrm{Im}\,\alpha|$, then $\zeta = 0$ is contained in the smallest circle containing $\alpha$ and its conjugate. In section 8 we will see that when a polynomial with a double zero is subjected to a small perturbation causing the double zero to split, the smallest circle containing the split zeros contains a $\zeta$. From these facts we might conclude that the smallest circle containing two zeros of any polynomial p contains a $\zeta$.

This  conclusion  is  supported  by  all  the  experimental  results 
reported  in  chapters  IX  and  X,  using  norms  which  measure  absolute  or 
relative  changes  in  the  coefficients  of  p.  But  an  investigation  to 
settle  this  specific  question  turned  up  a counterexample,  given  below, 
and  led  to  a further  conjecture  which  is  not  yet  resolved. 

The counterexample was discovered by computationally exploiting the analytic swindle described in section 6. A crude optimization program varied the zeros of a real cubic polynomial and the fixed constant R in order to make the $\zeta$'s lie as far as possible from the center of the smallest circle containing the two complex zeros of the polynomial. A polynomial $p(\tau)$ was found with zeros $\alpha$ at 1.0 and .224 ± .174i. When R = 1.987 the zeros of $s(\tau)$, as in equation (6.1), were -.830 and .424 ± .099i; see Figure VI.1. Thus the complex $\zeta$'s are just outside the circle containing the complex zeros $\alpha$.

The swindle was used because the polynomial equation $s(\tau) = 0$ may be solved quickly. Our real interest, of course, is in finding an example without using the swindle. So another crude optimization program was run with $p(\tau)$ fixed but with the norm weights allowed to vary in such a way that $\zeta$ = .424 ± .174i remained a solution of (1.1). Surprisingly enough, the program quickly converged to a suitable counterexample: Let the weights be 1, 1000, and 10000. Then (1.1) has no solutions inside the smallest circle containing the $\alpha$'s .224 ± .174i. The closest $\zeta$'s are at .4245 ± .0993i and 0. See Figure VI.2.
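The swindle computation for the cubic can be reproduced with a short root finder. The sketch below is an illustration under stated assumptions, not the program used for the original experiments: it builds $s(\tau) = \tau p'(\tau) - R\,p(\tau)$ from the three quoted zeros and finds its roots with a Durand-Kerner iteration (a method chosen here for brevity). The value R = 1.987 is used because it is the value for which the quoted zeros -.830 and .424 ± .099i are reproduced to three figures.

```python
def poly_from_zeros(zeros):
    """Coefficients (highest degree first) of the monic polynomial with the given zeros."""
    coeffs = [1 + 0j]
    for z in zeros:
        nxt = coeffs + [0j]
        for k in range(len(coeffs)):
            nxt[k + 1] -= z * coeffs[k]
        coeffs = nxt
    return coeffs


def swindle_coeffs(p_coeffs, R):
    """Coefficients of s(t) = t*p'(t) - R*p(t): the term a_k t^(n-k) becomes (n-k-R) a_k t^(n-k)."""
    n = len(p_coeffs) - 1
    return [(n - k - R) * a for k, a in enumerate(p_coeffs)]


def poly_eval(coeffs, x):
    """Horner evaluation, coefficients highest degree first."""
    acc = 0j
    for c in coeffs:
        acc = acc * x + c
    return acc


def durand_kerner(coeffs, iterations=500):
    """Simultaneous iteration for all roots of a polynomial with simple zeros."""
    n = len(coeffs) - 1
    monic = [c / coeffs[0] for c in coeffs]
    roots = [complex(0.4, 0.9) ** k for k in range(n)]  # conventional starting points
    for _ in range(iterations):
        for i in range(n):
            denom = 1 + 0j
            for j in range(n):
                if j != i:
                    denom *= roots[i] - roots[j]
            roots[i] -= poly_eval(monic, roots[i]) / denom
    return roots


if __name__ == "__main__":
    alphas = [1.0, complex(0.224, 0.174), complex(0.224, -0.174)]
    for zeta in durand_kerner(swindle_coeffs(poly_from_zeros(alphas), 1.987)):
        # distance from the center 0.224 of the circle of radius 0.174
        print(zeta, abs(zeta - 0.224))
```

The printed distances show directly whether each $\zeta$ lies inside or outside the smallest circle containing the complex pair of zeros.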

Thus we must discard the conjecture that the smallest circle containing two zeros of a polynomial contains a $\zeta$. That should come as no surprise, however, for the corresponding conjecture about derivatives is not true either: the smallest circle containing two zeros of a polynomial need not contain a zero of the derivative. Rather the following is known:

Proposition. Let a circle of radius $\rho$ contain m zeros of a polynomial p of degree n. Then there is a zero of the (m-1)th derivative of p in the concentric circle of radius

$$\rho\,\csc((\pi/2)/(n+1-m)) .$$

This proposition is stated in a stronger form and proved by Kahan [17]. The proposition suggests the following revised

Conjecture. Let a circle of radius $\rho$ contain m zeros of a polynomial p of degree n. Then there is a solution of the appropriate equation for the nearest polynomial with an m-tuple zero within the concentric circle of radius

$$\rho\,\csc((\pi/2)/(n+1-m)) .$$

Thus real cubic polynomials that have a complex conjugate pair of zeros $\alpha$ should have a solution $\zeta$ for a double zero such that $|\zeta - \mathrm{Re}\,\alpha| \le \sqrt{2}\,|\mathrm{Im}\,\alpha|$. None of the examples we have encountered or constructed has violated this revised conjecture.

8.  Infinitesimal Location Theory

This section provides a bridge between the location theory of previous sections and the perturbation theory of the next chapter. In this section we seek to answer the question: "Where do the solutions $\zeta$ of (1.1) go when a polynomial with a double zero is perturbed infinitesimally?"

Recall that if $\alpha$ is a double zero of a polynomial p then it is a solution of equations (1.1) and (6.1), as would be expected, since a place where no perturbation is required to get a double zero is obviously a critical point for norms of such perturbations. Most perturbations of a polynomial with a multiple zero will break that multiple zero into ill conditioned simple zeros, but we shall see that the solution of (1.1) only moves in a well conditioned manner when subject to such a perturbation.

Let

$$p(\tau) = (\tau-\alpha)^2 q(\tau) , \qquad q(\alpha) \ne 0 ,$$

be our starting polynomial with a double zero and a solution of (1.1) at $\alpha$. Let

$$\tilde p(\tau) = p(\tau) + \delta\epsilon\,h(\tau) , \qquad h(\alpha) \ne 0 ,$$

be p subject to a perturbation which is a linear function of $\delta\epsilon$. Also $\alpha + \delta\alpha$ will represent a zero of $\tilde p$ perturbed from $\alpha$. Then expanding in Taylor series,

$$0 = \tilde p(\alpha+\delta\alpha) = \tfrac12(\delta\alpha)^2 p''(\alpha) + \delta\epsilon\,(h(\alpha) + \delta\alpha\,h'(\alpha)) + \dots .$$

Simplifying, and using $p''(\alpha) = 2q(\alpha)$, we find

(8.1)  $\delta\alpha \approx \pm((-h(\alpha)/q(\alpha))\,\delta\epsilon)^{1/2}$ ,

the classical result that a double zero tends to divide into two simple zeros according to a fractional power of the perturbation.

$\alpha$ is also a zero of

$$f(\zeta) = R(\zeta)p(\zeta) - \zeta p'(\zeta) .$$

Let $\alpha + \delta\zeta$ be the perturbed solution when p is perturbed to $\tilde p$. We wish to find a Taylor series expansion for $\delta\zeta$ in terms of $\delta\epsilon$. R is not analytic in $\zeta$, so we must use the fact that it is an analytic real function of the real variables $\mathrm{Re}\,\zeta$ and $\mathrm{Im}\,\zeta$. Eventually we find that

(8.2)  $\delta\zeta = \{(R(\alpha)h(\alpha) - \alpha h'(\alpha))/(2\alpha q(\alpha))\}\,\delta\epsilon + O(\delta\epsilon^2)$

provided $\alpha \ne 0$ and $R(\alpha)h(\alpha) - \alpha h'(\alpha) \ne 0$. When the last quantity vanishes instead, h is a kind of "orthogonal" perturbation which does not affect the solution $\zeta$ of (1.1) to first order.

Comparing (8.2) and (8.1) we see that for a typical perturbation h, the zeros of $\tilde p$ move away from $\alpha$ much faster than the zero of f. Since those ill conditioned zeros of $\tilde p$ are moving in opposite directions, the smallest circle containing them will also contain a solution of (1.1) whenever $\tilde p$ is close enough to the manifold of polynomials with double zeros.

For further comparison, consider the change in the zero of the derivative of p. If $\alpha + \delta\theta$ denotes the zero of $\tilde p'$, we find that

$$\delta\theta = (-h'(\alpha)/2q(\alpha))\,\delta\epsilon$$

provided $h'(\alpha) \ne 0$. So the zero of the derivative also changes linearly with $\delta\epsilon$. If $(R(\alpha)h(\alpha)/\alpha h'(\alpha))$ is sufficiently small, as must occur if $\alpha$ is sufficiently close to zero, then $\delta\zeta$ and $\delta\theta$ are nearly the same. Unfortunately $\delta\zeta$ and $\delta\theta$ are quite different in general so $\delta\theta$ may not serve well as an estimate of $\delta\zeta$.
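The contrast between (8.1) and (8.2) can be seen numerically. The sketch below is an illustration under assumptions that are not in the original: it uses the swindle case of a constant R = 1, the example polynomial $p(\tau) = (\tau-1)^2(\tau-3)$ (so $\alpha = 1$, $q(\alpha) = -2$), and the constant perturbation $h(\tau) = 1$. Newton's method then locates the perturbed solution of $f = 0$ and one of the split zeros.

```python
ALPHA = 1.0   # double zero of the example polynomial (assumed example)
R = 1.0       # constant R, i.e. the swindle case (assumption)


def p(t):
    return (t - ALPHA) ** 2 * (t - 3.0)


def dp(t):
    return 2.0 * (t - ALPHA) * (t - 3.0) + (t - ALPHA) ** 2


def d2p(t):
    return 2.0 * (t - 3.0) + 4.0 * (t - ALPHA)


def newton(func, dfunc, x0, steps=50):
    """Plain Newton iteration from the starting guess x0."""
    x = x0
    for _ in range(steps):
        x -= func(x) / dfunc(x)
    return x


def first_order_shift(R_value, alpha, h, dh, q):
    """Coefficient of delta-epsilon in (8.2)."""
    return (R_value * h - alpha * dh) / (2.0 * alpha * q)


def solution_shift(d_eps):
    """Shift of the solution of f = R*(p + d_eps*h) - t*p' = 0 with h = 1."""
    f = lambda t: R * (p(t) + d_eps) - t * dp(t)
    df = lambda t: (R - 1.0) * dp(t) - t * d2p(t)
    return newton(f, df, ALPHA) - ALPHA


def zero_shift(d_eps):
    """Shift of one of the two split zeros of p + d_eps, started from the estimate (8.1)."""
    start = ALPHA + (d_eps / 2.0) ** 0.5
    return newton(lambda t: p(t) + d_eps, dp, start) - ALPHA
```

For $\delta\epsilon = 10^{-6}$ the solution of $f = 0$ moves by about $2.5\times10^{-7}$ while the split zero moves by about $7\times10^{-4}$, which is the linear versus square-root behavior described above.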


CHAPTER  VII 


PERTURBATION  THEORY  FOR  MULTIPLE  ZEROS  OF  POLYNOMIALS 
1.  Introduction

In this chapter we will recall the standard theory of perturbations of multiple zeros of polynomials, discern its limitations, and propose a more satisfactory theory which reflects the insights gained from the research described in previous chapters.

To recall the classical theory, start with a polynomial with multiple zero $\alpha$:

$$p(\tau) = (\tau-\alpha)^m q(\tau) , \qquad q(\alpha) \ne 0 .$$

The condition $q(\alpha) \ne 0$ means that the multiplicity of $\alpha$ is precisely m. We wish to see how an arbitrary perturbation of p affects $\alpha$. In general $\alpha$ will tend to split up into m distinct zeros.

Apply a perturbing polynomial $\epsilon r(\tau)$ of degree at most $n-1$ to get

$$\tilde p(\tau) = (\tau-\alpha)^m q(\tau) + \epsilon r(\tau) .$$

If $(\tau-\alpha)^m$ divided $r(\tau)$ then the problem would be uninteresting since the m-tuple zero $\alpha$ would retain its identity regardless of the perturbation. Similarly if $(\tau-\alpha)^k$ divided $r(\tau)$, $1 \le k \le m-1$, then the k-tuple zero $\alpha$ would persist after perturbation and the only interesting problem would be the fate of the zeros of $(\tau-\alpha)^{m-k}q(\tau) + \epsilon\,(r(\tau)/(\tau-\alpha)^k)$. Thus we may assume without loss of generality that $(\tau-\alpha)$ does not divide $r(\tau)$, i.e. $r(\alpha) \ne 0$.

For our purposes the degree of p is presumed to be known and fixed. Since we are only interested in the zeros of the perturbed polynomial, there is no essential loss of generality in restricting the degree of r to be no greater than $n-1$, because a small perturbation $\epsilon r$ of degree n would be equivalent to some other small perturbation of smaller degree.

Let $\alpha + \eta$ represent a zero of the perturbed polynomial $\tilde p$:

(1.1)  $\tilde p(\alpha+\eta) = 0 = \eta^m q(\alpha+\eta) + \epsilon r(\alpha+\eta)$ .

Thus

$$\epsilon = \eta^m[-q(\alpha+\eta)/r(\alpha+\eta)] .$$

However our interest is in expressing $\eta$ in terms of $\epsilon$. Since r and q are polynomials they may be expanded easily in a Taylor series about $\alpha$; thus

$$\epsilon = -\eta^m[q(\alpha)/r(\alpha)] + \text{higher order terms} .$$

Then

$$\eta = [(-r(\alpha)/q(\alpha))\,\epsilon]^{1/m} + \text{higher order terms} .$$

The m different mth roots define the different perturbations $\eta$ corresponding to the m zeros of $\tilde p$ derived from the m-tuple zero $\alpha$ of p.

Thus we seem to have a series in fractional powers of $\epsilon$ when $m > 1$. In the next section we will indicate a rigorous justification for this result and explain a constructive method for the higher order terms.

Our overall goal is to find series that converge rapidly, since we do not want to calculate more than one or two terms. Consequently we want series that converge over the largest possible region so that convergence will be fast in the region of interest. If the region of convergence is not much larger than the region of interest, convergence is so slow there that the series "fails" in the sense that it is not practically useful. A worse failure arises when the region of convergence does not contain all of the region of interest.


2.  Classical  Theory  of  Expansions  of  Algebraic  Functions 
In the previous section we indicated how to solve

(2.1)  $f(\epsilon,\eta) = \eta^m q(\alpha+\eta) + \epsilon r(\alpha+\eta) = 0$ ,

subject to

(2.2)  $\deg q = n - m < n$ ,  $\deg r \le n - 1$ ,  $r(\alpha) \ne 0$ ,  $q(\alpha) \ne 0$ ,

for $\eta$ in terms of a series in fractional powers of $\epsilon$. Now we will cite the classical results which justify our approach and explain how to construct that series.

$f(\epsilon,\eta) = 0$ is an example of an algebraic equation defining algebraic functions $\epsilon$ or $\eta$ in terms of the other. It is easy to get $\epsilon$ as a function of $\eta$; our goal is to construct $\eta$ as a function of $\epsilon$. We will recall certain results from standard texts, changing the notation to suit our problem, and omitting hypotheses which duplicate our assumptions (2.2).

The  first  result  is 


Weierstrass' Preparation Theorem [22, p. 105]: There is a neighborhood

$$|\epsilon| < \rho_1 , \qquad |\eta| < \rho_2 ,$$

such that

$$f(\epsilon,\eta) = \{E_0(\epsilon) + E_1(\epsilon)\eta + \dots + E_{m-1}(\epsilon)\eta^{m-1} + \eta^m\}\,g(\epsilon,\eta)$$

for functions $E_0,\dots,E_{m-1}$ which are analytic in that neighborhood and g which is analytic and never vanishes in that neighborhood.

$$E_0(0) = E_1(0) = \dots = E_{m-1}(0) = 0 .$$

Expansions  of  Simple  Zeros 

Consider  first  the  case  of  expansions  of  a simple  zero.  The  next 
result  is  a consequence  of  the  preparation  theorem: 

Implicit Function Theorem [22, p. 109]: When $m = 1$, then there is a neighborhood

$$|\epsilon| < \rho_1 , \qquad |\eta| < \rho_2 ,$$

such that $f(\epsilon,\eta) = 0$ has a unique root $\eta = \eta(\epsilon)$ for any $\epsilon$ in the neighborhood. $\eta(\epsilon)$ is single valued and analytic in the neighborhood and $\eta(0) = 0$.

In other words, in the vicinity of a simple zero $\alpha$, $\eta$ may be expressed as a Taylor series in $\epsilon$. The theorem says nothing about the size of that vicinity; it may be quite small.

If all the zeros of p are simple, then there is a neighborhood in which the n zeros of $p(\tau)+\epsilon r(\tau)$ are all simple and they may be expressed as n Taylor series in $\epsilon$, defining n analytic functions of $\epsilon$.

Given a function $\eta(\epsilon)$ defined by the polynomial equation $f(\epsilon,\eta) = 0$, a singular point $\epsilon_0$ may be defined for our purpose as one for which the discriminant of $f(\epsilon_0,\eta)$ vanishes. The discriminant of a polynomial with n zeros $\alpha_1,\dots,\alpha_n$ may be defined [10, p. 115] to be

$$D(\epsilon) = \prod_{1 \le i < j \le n} (\alpha_i - \alpha_j)^2 .$$

D is a function of $\epsilon$ because the zeros $\alpha_i$ are. $D(\epsilon)$ may also be expressed [12, p. 39] as a polynomial in the $\eta$-coefficients of $f(\epsilon,\eta)$.

Then at a singular point $\epsilon_0$, $p(\tau) + \epsilon_0 r(\tau)$ has at least one multiple zero. Bliss [1, p. 29] shows that the radii of convergence of the n Taylor series for perturbed simple zeros are at least as large as the distance to the nearest singular point. Thus when perturbing $p(\tau)$, with all zeros simple, in the direction $r(\tau)$, the expansions in powers of $\epsilon$ have radii of convergence at least as large as $|\epsilon_0|$, where $p(\tau)+\epsilon_0 r(\tau)$ is the nearest polynomial in that direction on the manifold of polynomials with double zeros. When p and r are real we must remember that complex $\epsilon$ must be considered when computing radii of convergence.

It is usually the case, moreover, that the radius of convergence is exactly the least $|\epsilon|$ such that $p(\tau)+\epsilon r(\tau)$ has a double zero. Of course if p and r have some zero in common then the "series" for that zero will converge everywhere. But in the usual case when the zeros of p and r are distinct, the Taylor series which coalesce to a multiple zero of $p + \epsilon_0 r$ cannot converge for $|\epsilon| > |\epsilon_0|$.

Expansions from a Singular Point

What if we start from a singular point, where $p(\tau)$ has a multiple zero? The answer is contained in

Puiseux's Theorem [10, p. 118]: Let $m > 1$ in (2.1). Then there is a neighborhood

$$|\epsilon| < \rho_1 , \qquad |\eta| < \rho_2 ,$$

and an integer k such that $\eta$ is an analytic function of $\theta$, where $\theta^k = \epsilon$. The k values of $\theta$ determine k analytic functions.

Since we require that $r(\alpha) \ne 0$ we will find that there are $k = m$ distinct branches, defining m Puiseux fractional power series. As before, the radius of convergence depends on the distance to the next singular point in any of the directions $\epsilon r$ as $\epsilon$ takes on complex values.

Newton's polygons may be used to transform f into a form from which it is convenient to construct the actual expansions. For details the curious may consult Bliss [1, p. 35] or Kung and Traub [40] for a modern algorithmic account; the process involves expanding $f(\epsilon,\eta)$ in a Taylor series in both variables $\epsilon$ and $\eta$, and then plotting points corresponding to the terms with non-zero coefficients. Thus

(2.3)  $f(\epsilon,\eta) = q(\alpha)\,\epsilon^0\eta^m + r(\alpha)\,\epsilon^1\eta^0 + \text{other terms}$ .

Because our discussion is based on the constraints (2.2) the Newton polygon has the especially simple form shown in Figure VII.1. Bliss shows how to use the Newton polygon to discover the substitutions

$$\epsilon = \theta^m \quad\text{and}\quad \eta = \theta\phi$$

which transform (2.1) to

(2.4)  $\phi^m q(\alpha+\theta\phi) + r(\alpha+\theta\phi) = 0$ .

Bliss shows that all the expansions of interest are obtained from (2.4), which may be solved easily by the method of substitution or by faster methods [40] to express $\phi$ as a Taylor series in $\theta$.

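Define

$$\chi(\tau) = -r(\tau)/q(\tau)$$

and suppose

$$\phi = A + B\theta + C\theta^2 + O(\theta^3) ;$$

then we find that

$$A^m = \chi(\alpha) ,$$

$$B = (A^2/m)\,(\chi'(\alpha)/\chi(\alpha)) ,$$

$$C = (A^3/2m)\left\{\frac{\chi''(\alpha)}{\chi(\alpha)} + \left(\frac{3-m}{m}\right)\left(\frac{\chi'(\alpha)}{\chi(\alpha)}\right)^2\right\} .$$

It does not matter whether we use one value of A and m values of $\theta$ or vice versa. Higher order terms are tedious to derive for general m.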

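For $m = 1$ the expressions become

$$\eta = A\epsilon + B\epsilon^2 + C\epsilon^3 + O(\epsilon^4)$$

where

$$A = \chi(\alpha) ; \qquad B = A\chi'(\alpha) ; \qquad C = A\left((\chi'(\alpha))^2 + \tfrac12\chi(\alpha)\chi''(\alpha)\right) .$$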

For $m = 2$, however,

\[ \eta = A\varepsilon^{1/2} + B\varepsilon + C\varepsilon^{3/2} + O(\varepsilon^2) \]

where

\[ A^2 = x(\alpha), \qquad B = \tfrac12 x'(\alpha), \qquad C = \tfrac14 A\bigl(x''(\alpha) + (x'(\alpha))^2/(2x(\alpha))\bigr). \tag{2.6} \]
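The coefficient formulas above can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes $f(\varepsilon,\eta) = \eta^m q(\alpha+\eta) + \varepsilon r(\alpha+\eta)$, takes the principal $m$-th root for $A$, and uses the hypothetical helper names `puiseux_coefficients` and `approximate_zero`. The test case is $q = 1$, $r(\tau) = 2\tau - 1$, $\alpha = 1$, $m = 2$, whose perturbed zeros are known in closed form.

```python
import numpy as np
from numpy.polynomial import Polynomial


def puiseux_coefficients(q, r, alpha, m):
    """Return A, B, C in phi = A + B*theta + C*theta**2 (principal branch)."""
    qa, q1, q2 = q(alpha), q.deriv(1)(alpha), q.deriv(2)(alpha)
    ra, r1, r2 = r(alpha), r.deriv(1)(alpha), r.deriv(2)(alpha)
    w = r1 * qa - ra * q1                    # numerator of (r/q)'
    x0 = -ra / qa                            # x(alpha)
    x1 = -w / qa**2                          # x'(alpha)
    x2 = -(r2 * qa - ra * q2) / qa**2 + 2 * q1 * w / qa**3   # x''(alpha)
    A = complex(x0) ** (1.0 / m)             # one of the m roots of A**m = x(alpha)
    B = (A**2 / m) * (x1 / x0)
    C = (A**3 / (2 * m)) * (x2 / x0 + (3 - m) / m * (x1 / x0) ** 2)
    return A, B, C


def approximate_zero(q, r, alpha, m, eps):
    """Three-term approximation alpha + theta*phi with theta**m = eps."""
    A, B, C = puiseux_coefficients(q, r, alpha, m)
    theta = complex(eps) ** (1.0 / m)
    return alpha + theta * (A + B * theta + C * theta**2)


# (tau - 1)^2 + eps*(2*tau - 1): q = 1, r = 2*tau - 1, double zero at alpha = 1.
q = Polynomial([1.0])
r = Polynomial([-1.0, 2.0])
eps = 1e-4
approx = approximate_zero(q, r, 1.0, 2, eps)
exact = 1 - eps + 1j * np.sqrt(eps * (1 - eps))   # closed-form zero of the quadratic
error = abs(approx - exact)
```

The other branch is obtained by replacing $A$ with $-A$ (equivalently, $\theta$ with $-\theta$), in line with the remark that one value of $A$ and $m$ values of $\theta$ suffice.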
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3.  Failure  of  Classical  Taylor  and  Puiseux  Series  Expansions 

Suppose we consider perturbing the quadratic polynomial $(\tau-1)^2$ in the direction toward $(\tau-0)^2$, i.e.

\[ p(\tau) = (\tau-1)^2 + \varepsilon(2\tau-1). \]

Then the zeros of $p$ are

\[ 1 - \varepsilon \pm i\,\varepsilon^{1/2}(1-\varepsilon)^{1/2}. \]

We could expand $(1-\varepsilon)^{1/2}$ in a Taylor series $1 - \tfrac12\varepsilon - \tfrac18\varepsilon^2 - \cdots$, yielding Puiseux fractional power series for the zeros; those series can not converge outside a circle of radius equal to the distance to the nearest singularity of $(1-\varepsilon)^{1/2}$. That singularity is the branch point at $\varepsilon = 1$.

Thus  when  we  consider  perturbations  of  p from  one  point  on  the 
manifold  of  quadratic  polynomials  with  a double  zero  toward  another 
point  on  that  manifold,  the  fractional  power  series  expansions  of  the 
perturbed  double  zero  fail  to  converge  rapidly  as  that  manifold  is 
approached.  The  same  slow  convergence  occurs  whenever  we  attempt 
expansions  from  one  point  on  the  manifold  toward  another  point  on  the 
manifold.  For  practical  purposes,  a power  series  that  converges 
slowly  is  worth  little  more  than  one  that  does  not  converge  at  all. 

Figure VII.2 represents the space of monic real quadratic polynomials. Each point in the plane corresponds to such a polynomial. The coordinates of a point corresponding to

\[ p(\tau) = \tau^2 + p_1\tau + p_2 \]

are the coefficients $p_1$ and $p_2$. The curve is the manifold of polynomials with double zeros; its equation is $p_1^2 = 4p_2$.

[Figure VII.2. Legend: The zeros of polynomials in the shaded region may be represented by convergent Puiseux fractional power series from *. The zeros of polynomials on the tangent line may be represented by convergent finite integral power series from *.]

The * marks the polynomial $p(\tau) = (\tau-1)^2$ whose coordinates are $p_1 = -2$, $p_2 = 1$. We can imagine perturbing $p$ to any other polynomial $\tilde p$ in the space; then we may ask: can the zeros of $\tilde p$ be obtained from the zeros of $p$ by convergent Puiseux fractional power series in $\varepsilon(\tilde p - p)$? The shaded region in Figure VII.2 is the region of points $\tilde p$ for which those fractional power series do converge. That region is bounded by the union of the parabola $p_1^2 = 4p_2$ and another parabola, $p_1^2 + 8p_1 + 8 = -4p_2$, which is congruent and osculatory to the first. Puiseux fractional power series expansions from * will not converge to any point outside the shaded region.

The shaded regions were determined by considering real perturbations in real directions; that turns out to be sufficient for this special case of a real quadratic with a double zero. For more general polynomials it would also be necessary to consider complex perturbations in order to properly delimit the shaded region.

What happens on the indicated line tangent to the manifold at *? That line represents polynomials one of whose zeros is always 1. Then the appropriate "expansions" for the two zeros of

\[ (\tau-1)^2 \]

when perturbed in the direction

\[ \tau^2 + \rho\tau - \rho - 1 \]

are $1$ and $1 - \varepsilon(\rho+2)$. This finite expansion converges everywhere on the tangent line.

Notice that there are polynomials arbitrarily close to * such as

\[ (\tau-1)^2 - \delta\left(\tau + \tfrac{\delta}{5} - 1\right) \]

whose zeros can not be represented by convergent Puiseux fractional power series from *.
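The boundary of the convergence region can be reproduced with a few lines of arithmetic. The sketch below is an illustration under stated assumptions, not the author's procedure: it considers only real perturbations from $* = (\tau-1)^2$ toward the monic quadratic with coefficients $(p_1, p_2)$, finds the nonzero value of $\varepsilon$ at which the discriminant of the perturbed quadratic vanishes, and declares convergence at $\varepsilon = 1$ when that value lies outside the unit circle. The function names are hypothetical.

```python
import math


def singular_epsilon(p1, p2):
    """Nonzero eps where (tau-1)^2 + eps*(a*tau + b) acquires a double zero.

    Here a = p1 + 2 and b = p2 - 1 are the components of the direction from *.
    The discriminant is eps*(eps*a**2 - 4*(a + b)), so eps = 4*(a + b)/a**2.
    """
    a = p1 + 2.0
    b = p2 - 1.0
    if a == 0.0:
        return math.inf        # discriminant is -4*eps*b: no other singularity
    return 4.0 * (a + b) / a**2


def puiseux_converges_at(p1, p2):
    """True if the singularity lies beyond eps = 1 (real directions only)."""
    return abs(singular_epsilon(p1, p2)) > 1.0


# A polynomial arbitrarily close to * whose series from * does not converge:
# (tau-1)^2 - delta*(tau + delta/5 - 1), read here with delta = 0.01.
delta = 0.01
p1 = -2.0 - delta
p2 = 1.0 + delta - delta**2 / 5.0
eps_star = singular_epsilon(p1, p2)    # close to -4/5, inside the unit circle
```

Setting `singular_epsilon` equal to $+1$ and $-1$ gives the two bounding parabolas $p_1^2 = 4p_2$ and $p_1^2 + 8p_1 + 8 = -4p_2$. On the tangent line ($a + b = 0$) the function returns 0, so `puiseux_converges_at` reports False there even though the finite integral expansion exists; that case has to be treated separately, as the text does.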

In  contrast  to  the  case  of  starting  on  the  manifold,  suppose  now 
that  we  start  off  it,  but  near  it.  Then  the  regions  where  convergence 
of  conventional  Taylor  series  may  occur  are  circumscribed  indeed;  see 
Figure  VII. 3 for  examples. 

In conclusion, we see that the classical Taylor and Puiseux series approaches for expressing changes of zeros in terms of a parameter of the perturbations are limited in applicability since neither series will converge beyond the nearest singularity of the function it represents. In our case singularities amount to double zeros. In the next section we will see how to alleviate this problem.
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4.  Why  Find  the  Nearest  Polynomial  with  a Multiple  Zero? 

Suppose that the output of a physical system may be modeled by the zeros of a polynomial $p$ whose somewhat uncertain coefficients may be computed from experimental data. Suppose furthermore that polynomials with multiple zeros lie within the region of uncertainty.

We may desire to determine how the zeros of the polynomial can vary as the coefficients vary within their uncertainty. A natural way to do this is with a Taylor series expansion of the type described in section 2, but such an approach is doomed to fail when $p$ is near a pejorative manifold. Such expansions are not valid across the manifolds of polynomials with multiple zeros. Thus we can not study the variation of the zeros of $p$ subject to all perturbations that interest us if the ball representing our uncertainty intersects a manifold. Furthermore the convergence rate of the expansions we do have becomes unacceptable as they approach their radius of convergence. Thus we would like to find an expansion process that is convergent in a ball that is much larger than the uncertainty in $p$. Then only 1 or 2 terms of an expansion would be needed in order to bound the variation in the zeros as $p$ moves within its ball of uncertainty. See Figure VII.4.

In the rest of this chapter we will describe a new method for bounding variations of zeros that may be used in situations like that of Figure VII.4. This technique is based on finding a polynomial $\hat p = p + \delta p$ which is close to $p$ and has as high a multiplicity configuration as any in the ball of uncertainty. All its zeros are well conditioned, reflecting the fact that it is far from the next higher manifold. $\hat p$ would usually be found by one of the methods described in chapters III-V. When such a $\hat p$ is found, the technique to be described exploits the manifold on which $\hat p$ lies to obtain bounds applicable over the entire region of interest.
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Figure  VII. 4.  Moving  to  a manifold  to  improve  the  region  of 
convergence. Taylor series expansions from $p$ converge only in the shaded region. Puiseux fractional power series expansions from $\hat p = p + \delta p$ converge in a large region as in Figure VII.2 which however omits points arbitrarily close to $\hat p$. The new expansions from $\hat p$ converge in a region extending to the next higher manifold and including all of $p$'s ball of uncertainty.
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5.  Resolving  Expansions  into  Components 

Our task now is to find a simpler method for describing the changes in the zeros of a polynomial due to perturbations.

First consider a polynomial on the manifold of polynomials with one $m$-tuple zero:

\[ p(\tau) = (\tau-\alpha)^m q(\tau), \qquad q(\alpha) \neq 0. \]

We want to perturb $p$ to another polynomial on that same manifold:

\[ \tilde p(\tau) = (\tau-\tilde\alpha)^m \tilde q(\tau), \qquad \tilde q(\tilde\alpha) \neq 0. \]

The classical fractional Puiseux series approach of the previous section attempts (and fails) to get from $p$ to $\tilde p$ along a straight line in the space of polynomials of degree $n$:

\[ p_\varepsilon(\tau) = (\tau-\alpha)^m q(\tau) + \varepsilon\bigl[(\tau-\tilde\alpha)^m \tilde q(\tau) - (\tau-\alpha)^m q(\tau)\bigr]. \]

See Figure VII.5.

We will instead move along the manifold, regarding it as a convenience rather than a barrier:

\[ p_\varepsilon(\tau) = \bigl[\tau - (\alpha + \varepsilon(\tilde\alpha-\alpha))\bigr]^m \bigl[q(\tau) + \varepsilon(\tilde q(\tau) - q(\tau))\bigr]. \]

Now the multiple zero stays multiple, and the change in the multiple zero may be easily expressed as a function of $\varepsilon$. If the multiple zero is $\alpha + \eta$ then

\[ \eta = (\tilde\alpha-\alpha)\varepsilon \]

which is certainly convergent for all $\varepsilon$. The changes in the other zeros are described by Taylor series in the classical manner. These


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Figure  VII. 5.  Two  ways  to  get  from  p to  p.  The  classical 
Puiseux  expansion  goes  directly  via  p.  The 
new  expansion  goes  along  the  manifold  via  p. 
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Taylor  series  will  converge  in  some  region  in  the  space  of  polynomials 
of degree $n-m$. That region is determined by the locations of manifolds of polynomials with multiple zeros in the $n-m$ dimensional space. These manifolds correspond to manifolds of polynomials with more than one multiple zero in the original $n$ dimensional space.
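The difference between the two routes is easy to see for quadratics. The sketch below is only an illustration, with hypothetical function names: it moves from $(\tau-1)^2$ to $\tau^2$ once along the straight line in coefficient space and once along the manifold of double zeros, and evaluates the discriminant on each path.

```python
import numpy as np

START = np.array([1.0, -2.0, 1.0])    # (tau - 1)^2, highest power first
TARGET = np.array([1.0, 0.0, 0.0])    # tau^2


def straight_line(eps):
    """Classical route: linear interpolation of the coefficients."""
    return START + eps * (TARGET - START)


def along_manifold(eps):
    """New route: interpolate the double zero itself, alpha + eps*(alpha~ - alpha)."""
    zero = 1.0 + eps * (0.0 - 1.0)
    return np.array([1.0, -2.0 * zero, zero**2])


def discriminant(c):
    """Discriminant of c[0]*tau^2 + c[1]*tau + c[2]; zero iff the zero is double."""
    return c[1] ** 2 - 4.0 * c[0] * c[2]


for eps in (0.25, 0.5, 0.75):
    print(eps, discriminant(straight_line(eps)), discriminant(along_manifold(eps)))
```

On the straight line the discriminant is $-4\varepsilon(1-\varepsilon)$, so the double zero splits for $0 < \varepsilon < 1$ and recombines at $\varepsilon = 1$, which is the branch point that limits the Puiseux series. Along the manifold the discriminant stays at zero (up to rounding) and the zero moves linearly in $\varepsilon$.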

For a specific example, if we start with a polynomial with a double zero, so $m = 2$, we can expand the zeros along the manifold until we reach a submanifold containing polynomials with two double zeros, or one quadruple zero, or some other configuration that implies a multiple zero in $q + \varepsilon(\tilde q - q)$. A submanifold of polynomials with a single triple zero, however, would have no effect on the expansion, for a triple zero in the perturbed polynomial implies only a simple zero in $q + \varepsilon(\tilde q - q)$.

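Obviously this approach can be extended to polynomials with several multiple zeros. To get from

\[ p(\tau) = \Bigl(\prod_i (\tau-\alpha_i)^{m_i}\Bigr) q(\tau) \]

to

\[ \tilde p(\tau) = \Bigl(\prod_i (\tau-\tilde\alpha_i)^{m_i}\Bigr) \tilde q(\tau) \]

just let

\[ p_\varepsilon(\tau) = \Bigl(\prod_i \bigl(\tau - (\alpha_i + \varepsilon(\tilde\alpha_i-\alpha_i))\bigr)^{m_i}\Bigr) \cdot \bigl(q(\tau) + \varepsilon(\tilde q(\tau) - q(\tau))\bigr). \]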

Suppose now that we wish to expand from a polynomial on a manifold to a polynomial off that manifold. As we saw in the previous section, a straight Taylor series expansion may be limited in applicability by the presence of the same or other manifolds. From our present vantage point it appears that the procedure most likely to succeed would be to expand along the manifold to get as close as possible to the off-manifold polynomial we seek, and then expand "orthogonally" directly from the manifold to that point with Taylor series. We would thus minimize the effect of nearby manifolds on the convergence of the Taylor series. Figure VII.6 illustrates the notion.

There may still be no reasonable way to expand from $p$ to every polynomial of degree $n$. For instance consider the situation in Figure VII.7. A self-intersection singularity, corresponding to a polynomial with two double zeros, means that it is impossible to expand from $p$ to $\tilde p$. If our problem were, however, to expand from $y$ to $\tilde y$, it might be possible to do so by finding a $\hat p$ on $y$'s manifold of polynomials with two double zeros.
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self-intersecting  manifold 


Figure VII.7. There is no reasonable way to expand from $p$ to $\tilde p$, or even to $\hat p$.


6.  A Practical  Technique  for  Bounding  Changes  in  Zeros 

In the previous section we introduced the notion of expanding along a manifold before resorting to conventional Taylor or Puiseux series techniques. In order to have a technique usable for bounding changes in zeros as coefficients vary, we need to overcome two problems:

1) Apparently it is necessary to solve the problem of finding $\hat p$, the nearest point on the manifold, for every $\bar p$ for which we want an expansion. As we have seen this is a difficult numerical problem that is even more intractable symbolically.

2) Our expansions have always been defined in terms of a direction $r(\tau)$ and a size parameter $\varepsilon$. We would like to state the expansion directly in terms of the perturbing polynomial without introducing the additional parameter $\varepsilon$.

The second problem may be solved fairly easily by letting $\varepsilon$ go to 1 at the end or by ignoring $\varepsilon$ altogether. We find that the term that was attached to the $k$th power of $\varepsilon$ contains powers of $r$ that are always greater than or equal to $k$, and thus we can construct a series in $r$, whether $r$ is represented by its coefficients, its zeros, or the value of $r$ and its derivatives at some point. The next section contains examples of such series.

As for the first problem, we might settle for $\hat s$, an approximation to $\hat p$ that can be expressed symbolically. $\hat s$ should be a satisfactory substitute in regions where the manifold is not too wild.

Figure VII.8 illustrates the approximation. Instead of $\hat p$ we could compute a projection $s$ of $\bar p$ on a tangent surface and map $s$ to a polynomial $\hat s$ on the manifold. We hope that $\hat s$ is reasonably close to $\hat p$.


Figure VII.8. As a practical matter, the new expansion must go from $p$ to $\bar p$ via $\hat s$ rather than $\hat p$. $\bar p$ is a polynomial for which $\hat p$ is the closest polynomial on the manifold.

Given $p$ and $\bar p$, $s$ is uniquely determined by the norm, but there are many possible ways of mapping from the tangent surface to the manifold. Unfortunately there is no simple way of insuring that $\hat s = \bar p$ when $\bar p$ is already on the manifold. Any discrepancy in this case is intolerable because it leads to the situation in Figure VII.9 with its familiar problem of short radii of convergence.

Any expansion technique for arbitrary $\bar p$ must somehow recognize when $\bar p$ is on the manifold. A vanishing discriminant is an example of a condition characterizing polynomials on the manifold. But such characterizations are too complicated to be useful.

The notion of expanding along the manifold may still be put to good use, however, if we only seek bounds on changes in zeros rather than explicit expansions in terms of a perturbation. Thus given $p$ with zeros $\theta_i$ of various multiplicities, we may ask for bounds on

\[ |\bar\theta_i - \theta_i| \]

for zeros $\bar\theta_i$ of polynomials $\bar p$ such that $\|p - \bar p\| \le \Delta$. See Figure VII.10. The variation of $\bar\theta_i$ with respect to $\theta_i$ can be thought of as having two components, one due to motion on the manifold and one due to motion orthogonal to the manifold. If we can bound these changes separately and independently then we can add the bounds to get the overall variation.

Taking a closer look at the components of $\bar p - p$, recall that

\[ p(\tau) = (\tau-\alpha)^m q(\tau), \]
\[ \tilde p(\tau) = (\tau-\tilde\alpha)^m \tilde q(\tau), \]

where

\[ \tilde\alpha = \alpha + \delta\alpha, \qquad \tilde q = q + \delta q. \]

$q$ is a monic polynomial of degree $n-m$; $\delta q$ is not monic and is of degree at most $n-m-1$. Then

\[ \tilde p - p = (\tau-\alpha)^m\,\delta q(\tau) + \sum_{j=1}^{m} \binom{m}{j} (\tau-\alpha)^{m-j}\,(q+\delta q)(\tau)\,(-\delta\alpha)^j \]

where $\binom{m}{j} \equiv m!/(j!\,(m-j)!)$. We will mostly be interested in the infinitesimal case for which we need not be concerned about the higher order terms.

Figure VII.9. Shortcoming in revised expansion method when $\bar p$ is on the manifold. The Puiseux expansion from $\hat s$ to $\bar p$ is doomed to have a short radius of convergence.

Figure VII.10. How do the zeros vary as $\bar p$ varies within the small ball centered on $p$? A bound may be computed by studying the variation in the zeros as $\bar p$ varies within the larger ball centered on $\hat p$.

Summary  of  the  New  Technique 

Before  looking  at  details  we  summarize  the  new  technique. 

We are given a polynomial $p$ with a norm and a bound on the uncertainty in $p$. We want a bound on the corresponding uncertainty in the zeros of $p$.

The ball representing polynomials practically indistinguishable from $p$ contains some polynomials with multiple zeros. By the numerical means discussed in chapters III to V, we locate the polynomial $\hat p$ nearest to $p$ with all zeros well conditioned; some are therefore multiple. Then we may determine a ball about $\hat p$ that contains the original ball about $p$ and which is usually only slightly larger. Then we may bound the variation in the zeros of polynomials $\bar p$ in this second ball.

To do so we first construct symbolic expansions for the changes in the zeros of $\hat p$ due to moving to another polynomial $\tilde p$ on the same manifold but within the second ball (Figure VII.10). For the multiple zeros $\tilde\alpha$ these expansions from $\hat\alpha$ have only two terms but for the simple zeros $\tilde\beta$ these expansions from $\hat\beta$ are Taylor series in the perturbation $\delta q$.

Now we compute expansions from $\tilde p$ to points $\bar p$ which lie on the planes normal to the manifold at $\tilde p$. These symbolic expansions are Puiseux fractional power series to get zeros $\bar\alpha$ from the multiple zeros $\tilde\alpha$ and Taylor series to get zeros $\bar\beta$ from simple zeros $\tilde\beta$. The series are in $\bar p - \tilde p$ which is orthogonal to the manifold at $\tilde p$.

Then we substitute, again symbolically, the series for $\tilde\alpha$ and $\tilde\beta$ in the second sets of series to obtain series for $\bar\alpha$ and $\bar\beta$ which do not contain $\tilde\alpha$ or $\tilde\beta$. Finally we may convert the numerical bound $\Delta$ on the size of the second ball into numerical bounds on the terms of the series for $\bar\alpha$ and $\bar\beta$.

It is essential to study an example to understand the technique. The example given in the next section is simplified but contains the essential ideas.

The  method  just  described  ought  to  be  compared  to  one  based  on 
the  results  of  Brian  Smith  '^Z).  Smith  uses  Gerschgorin  circles  to 
obtain  bounds  for  the  zeros  of  a polynomial  subject  to  uncertainty  in 
its  coefficients.  Smith's  bounds  are  easier  to  compute  than  those 
based  on  expansions,  but  they  may  be  unrealistic  by  a factor  that  is 
proportional  to  the  degree  of  the  polynomial.  However,  they  are  valid 
for  finite  as  well  as  infinitesimal  perturbations,  unlike  the  new 
method.  Comparative  evaluation  of  the  two  bounding  methods  must  be 
postponed  until  the  new  bounds  can  be  computed  automatically. 




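where $\tilde A$ is the $m-1$ by $n$ matrix

\[ \tilde A = \begin{pmatrix} \tilde e^* \\ \tilde e^* D \\ \vdots \\ \tilde e^* D^{m-2} \end{pmatrix}, \]

$\tilde e^* = (\tilde\alpha^{n-1}\ \ \tilde\alpha^{n-2}\ \cdots\ \tilde\alpha\ \ 1)$, which depends on $\tilde\alpha$, hence the ~ in $\tilde A$. This $\tilde A$ should not be confused with the $m$ by $n+1$ matrix $A$ of chapters III, IV, and V. $A$ or $e$ without ~ means $\tilde\alpha = \alpha$. $\delta\ell$ is an $m-1$ vector which is infinitesimal like $\delta q$ and $\delta\alpha$. To first order $W^{-1}\tilde A^*\delta\ell \doteq W^{-1}A^*\delta\ell$, so

\[ \delta p \equiv \bar p - p \doteq W^{-1}A^*\delta\ell + P_{m-1}P_1\,\delta q - mP_{m-1}q\,\delta\alpha = \begin{pmatrix} W^{-1}A^* & P_{m-1}P_1 & -mP_{m-1}q \end{pmatrix} \begin{pmatrix} \delta\ell \\ \delta q \\ \delta\alpha \end{pmatrix} = M\,\delta h. \]

The matrix operator $M$ is $n$ by $n$ and invertible so a specific infinitesimal perturbation $\delta p$ may be mapped into $\delta\ell$, $\delta q$, and $\delta\alpha$, the components of $\delta h$.

We would like to define a region in $\delta h$-space whose image, mapped into $\delta p$-space, is the ball $\|\delta p\|_W \le \Delta$. Obviously that region is just

\[ \{\delta h : \|\delta h\|_H \le \Delta\} \]

where $\|\delta h\|_H = \|M\,\delta h\|_W$. For infinitesimals with quadratic norms this approach is practical.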


Best Possible Bounds for Changes in Zeros Due to Variations
Over an Infinitesimal Ball

To see how to get the infinitesimal bounds in a series expansion let

\[ \|\delta p\|_W^2 = \delta p^*W\,\delta p = \delta h^*M^*WM\,\delta h = \delta h^*H\,\delta h = \|\delta h\|_H^2 \tag{6.1} \]

where

\[ H = \begin{pmatrix} AW^{-1}A^* & 0 & 0 \\ 0 & P_1^*XP_1 & -mP_1^*Xq \\ 0 & -mq^*XP_1 & m^2q^*Xq \end{pmatrix} \]

and

\[ X = P_{m-1}^*WP_{m-1}. \]

The zero entries in $H$ arise because $AP_{m-1} = 0$.

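Suppose we want to compute the first two terms of an infinitesimal bound for the zeros $\bar\alpha$ of

\[ \bar p(\tau) = p(\tau) + \delta p(\tau) = (\tau-\alpha)^2 q(\tau) + \delta p(\tau). \]

The change due to the move from $p$ to $\tilde p$ is just $\delta\alpha$. The orthogonal direction is $W^{-1}\tilde A^*\delta\ell = W^{-1}\tilde e\,\delta\lambda$ where $\tilde e^*$ is the evaluation functional for $\tilde\alpha$ and $\delta\lambda$ is a scalar. Then using (2.6),

\[ x(\tau) = -\delta\lambda\,(W^{-1}\tilde e)(\tau)/\tilde q(\tau), \]

\[ \bar\alpha - \tilde\alpha \doteq \sqrt{x(\tilde\alpha)} + \tfrac12 x'(\tilde\alpha) + \cdots. \]

But $x(\tilde\alpha) = -\delta\lambda\,(W^{-1}\tilde e)(\tilde\alpha)/\tilde q(\tilde\alpha)$ which is just a constant $\gamma_1$ times $\delta\lambda$. Likewise $x'(\tilde\alpha)$ is just a different constant $\gamma_2$ times $\delta\lambda$. Thus

\[ \bar\alpha - \alpha \doteq \sqrt{\gamma_1}\,(\delta\lambda)^{1/2} + \bigl(\tfrac12\gamma_2\,\delta\lambda + \delta\alpha\bigr) + \cdots. \]

How large can these terms become, given that $\|\delta h\|_H \le \Delta$? The maximum value of $|\delta\lambda|^2$ is $\Delta^2/(e^*W^{-1}e)$ so for the first term,

\[ \bigl|\sqrt{\gamma_1}\,(\delta\lambda)^{1/2}\bigr| \le |\gamma_1|^{1/2}\,\Delta^{1/2}/(e^*W^{-1}e)^{1/4}. \]

As for the second term,

\[ \bigl|\tfrac12\gamma_2\,\delta\lambda + \delta\alpha\bigr| = \left| \begin{pmatrix}\tfrac12\gamma_2 & 0 & 1\end{pmatrix} \begin{pmatrix}\delta\lambda\\ \delta q\\ \delta\alpha\end{pmatrix} \right| \le \Delta\,\sqrt{ \begin{pmatrix}\tfrac12\gamma_2 & 0 & 1\end{pmatrix} H^{-1} \begin{pmatrix}\tfrac12\gamma_2\\ 0\\ 1\end{pmatrix} }. \]

Such bounds are achievable by $\delta h$ satisfying $\|\delta h\|_H \le \Delta$ and so are best possible.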
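The claim that such a bound is attained rests on a standard fact about quadratic norms: the largest value of a linear functional $c^*\delta h$ over the ellipsoid $\delta h^*H\,\delta h \le \Delta^2$ is $\Delta\sqrt{c^*H^{-1}c}$. The sketch below checks this numerically for a randomly generated symmetric positive definite matrix standing in for $H$; the matrix, the vector `c`, and the function names are illustrative assumptions, not values from the text.

```python
import numpy as np


def best_possible_bound(c, H, delta):
    """Maximum of |c . dh| over dh with dh^T H dh <= delta^2 (real case)."""
    return delta * np.sqrt(c @ np.linalg.solve(H, c))


def maximizer(c, H, delta):
    """A perturbation dh on the ellipsoid that attains the bound."""
    z = np.linalg.solve(H, c)
    return delta * z / np.sqrt(c @ z)


rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))
H = G @ G.T + 4.0 * np.eye(4)           # stand-in for M* W M, positive definite
c = np.array([0.5, 0.0, 0.0, 1.0])       # plays the role of (gamma_2/2, 0, ..., 1)
delta = 1e-3

bound = best_possible_bound(c, H, delta)
dh = maximizer(c, H, delta)
attained = c @ dh                        # equals the bound up to rounding
```

Because the maximizer lies on the boundary of the ellipsoid, no smaller bound can hold for every admissible $\delta h$, which is the sense in which the text calls the bound best possible.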


A Region  Circumscribing  an  Infinitesimal  Ball 

The method just outlined is best possible for perturbations that are infinitesimal or essentially so. Sometimes we may be content with bounds that are not optimal but hopefully are realistic.

To that end rewrite (6.1) as

\[ \|\delta p\|_W^2 = \delta g^*V\,\delta g \]

where

\[ V = \begin{pmatrix} I & 0 & 0 \\ 0 & I & v \\ 0 & v^* & 1 \end{pmatrix}, \qquad v = -\frac{(P_1^*XP_1)^{-1/2}P_1^*Xq}{(q^*Xq)^{1/2}}, \]

and

\[ \delta g = \begin{pmatrix} W^{-1/2}A^*\delta\ell \\ (P_1^*XP_1)^{1/2}\delta q \\ m(q^*Xq)^{1/2}\delta\alpha \end{pmatrix}. \]

Then we might let

\[ \|W^{-1/2}A^*\delta\ell\|_2 = \|W^{-1}A^*\delta\ell\|_W \le \Delta, \]
\[ \|(P_1^*XP_1)^{1/2}\delta q\|_2 = \|P_{m-1}P_1\delta q\|_W \le \Delta, \]
\[ \|m(q^*Xq)^{1/2}\delta\alpha\|_2 = m\|P_{m-1}q\|_W\,|\delta\alpha| \le \Delta; \]

but depending on $v$, we might find that the image of the region so defined does not contain the entire ball $\|\delta p\|_W \le \Delta$. If $q = X^{-1}e_\alpha$ then $v = 0$ and the image is just the ball, while if $q = P_1u$ then $\|v\| = 1$ and the image is not an $n$-dimensional ball or ellipsoid but something of lower dimension which can not possibly contain the ball.
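To see what is going on, suppose $\|\delta p\|_W = \Delta$ exactly and $\delta\ell = 0$. How large can $\delta\alpha$ and $\delta q$ become? We have

\[ \Delta^2 = \begin{pmatrix} (P_1^*XP_1)^{1/2}\delta q \\ m(q^*Xq)^{1/2}\delta\alpha \end{pmatrix}^{\!*} \begin{pmatrix} I & v \\ v^* & 1 \end{pmatrix} \begin{pmatrix} (P_1^*XP_1)^{1/2}\delta q \\ m(q^*Xq)^{1/2}\delta\alpha \end{pmatrix} \]

so

\[ \delta q^*(P_1^*XP_1)\,\delta q + m^2q^*Xq\,|\delta\alpha|^2 \le \Delta^2/\mathrm{minev} \]

where "minev" means the smallest eigenvalue of

\[ \begin{pmatrix} I & v \\ v^* & 1 \end{pmatrix}. \]

But the eigenvalues of that matrix are just 1, of multiplicity $n-2$, $1 - \|v\|_2$, and $1 + \|v\|_2$. So at worst

\[ |\delta\alpha|^2 \le \frac{\Delta^2}{m^2\,q^*Xq\,(1-\|v\|_2)}, \qquad \|P_{m-1}P_1\delta q\|_W^2 \le \frac{\Delta^2}{1-\|v\|_2}, \]

where

\[ \|v\|_2^2 = \frac{q^*XP_1(P_1^*XP_1)^{-1}P_1^*Xq}{q^*Xq}. \]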
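The eigenvalue statement for the bordered identity matrix is easy to confirm numerically. The following sketch uses an arbitrary real vector `v` chosen for illustration and a hypothetical helper `bordered_identity`; it is a check of the linear algebra only, not of any quantity computed in the text.

```python
import numpy as np


def bordered_identity(v):
    """Return the symmetric matrix [[I, v], [v^T, 1]] for a real vector v."""
    k = len(v)
    V = np.eye(k + 1)
    V[:k, k] = v
    V[k, :k] = v
    return V


v = np.array([0.3, -0.2, 0.5])
norm_v = np.linalg.norm(v)
eigenvalues = np.linalg.eigvalsh(bordered_identity(v))   # ascending order
smallest, largest = eigenvalues[0], eigenvalues[-1]
```

The extreme eigenvalues are $1 - \|v\|_2$ and $1 + \|v\|_2$ and all the others equal 1, so the factor $1/\mathrm{minev}$ grows without bound as $\|v\|_2$ approaches 1.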


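Therefore our constraints should read

\[ \left\{ \begin{aligned} \|\delta\ell\|_L &= \|W^{-1}A^*\delta\ell\|_W \le \Delta_\ell = \Delta, \\ \|\delta q\|_Q &= \|P_{m-1}P_1\delta q\|_W \le \Delta_q = \Delta/(1-\|v\|_2)^{1/2}, \\ |\delta\alpha| &\le \Delta_\alpha = \Delta/\bigl(m\|P_{m-1}q\|_W(1-\|v\|_2)^{1/2}\bigr). \end{aligned} \right. \tag{6.2} \]

The image of such an infinitesimal region does indeed contain the ball $\|\delta p\|_W \le \Delta$, and in fact circumscribes it; the question remains: how much larger is the image than the ball? If $\delta\ell$, $\delta q$, and $\delta\alpha$ have bounds $\Delta_\ell$, $\Delta_q$, and $\Delta_\alpha$ in the proper norms, then

Thus bounds based on (6.2) will be realistic if and only if $\|v\|_2 \ll 1$.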

It turns out that $\|v\|_2 \ll 1$ if and only if $P_{m-1}q$, which has an $(m-1)$-tuple zero $\alpha$, is far from the nearest polynomial $P_{m-1}P_1u$ with an $m$-tuple zero $\alpha$. To see this, solve the least squares problem "find $u$ to minimize $\|P_{m-1}q - P_{m-1}P_1u\|_W$" to get

\[ u = (P_1^*XP_1)^{-1}P_1^*Xq \]

so

\[ \begin{aligned} \|P_{m-1}q - P_{m-1}P_1u\|_W^2 &= q^*P_{m-1}^*W^{1/2}\bigl(I - W^{1/2}P_{m-1}P_1(P_1^*XP_1)^{-1}P_1^*P_{m-1}^*W^{1/2}\bigr)W^{1/2}P_{m-1}q \\ &= q^*Xq - q^*XP_1(P_1^*XP_1)^{-1}P_1^*Xq \\ &= q^*Xq\,(1 - \|v\|_2^2) \end{aligned} \]

and

\[ \|v\|_2^2 = 1 - \frac{\|P_{m-1}q - P_{m-1}P_1u\|_W^2}{\|P_{m-1}q\|_W^2}. \]


Recall  from  section  II. 3 that  the  condition  number  y of  the 
multiple  zero  a is  inversely  related  to  the  distance  to  the  next 
higher  manifold.  In  fact,  from  the  definition  of  condition  number  in 
II. 4 we  know 

Y i i TOT 

for  any  y of  degree  n-m  or  less.  Take  y = q-P^u  in  particular 
to  see 

whence 

1/(1  - lwl2)  < n^p^^ljd  tivljh2 

‘ 2™2"’n-l'"2|Y2  ' 

Thus we have demonstrated the

Proposition. If the condition number of $\alpha$ is small then the image of the infinitesimal region defined by (6.2) is not much larger than the infinitesimal ball $\|\delta p\|_W \le \Delta$.
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Bounds  for  Changes  in  Zeros  Due  to  Variation 
Over  a Region  Circumscribing  a Ball 

When it is inconvenient to bound the changes in the zeros by use of (6.1) we can resort to (6.2). If the zero $\alpha$ is well conditioned and the ball is not too big then we have confidence that the error bounds we derive are not much larger than necessary.

So suppose that $\Delta_\ell$, $\Delta_\alpha$, and $\Delta_q$ bound $\delta\ell$, $\delta\alpha$, and $\delta q$. How can the zeros of $p$ vary subject to these bounds? Let $\alpha$ be the multiple zero and $\beta$ a simple zero of $q$. First consider possible changes due to motion along the manifold. Let $\tilde\alpha$ and $\tilde\beta$ denote corresponding zeros of a polynomial $\tilde p$ along the manifold. Trivially

\[ |\tilde\alpha - \alpha| \le \Delta_\alpha. \]

To get $\tilde\beta$ it is necessary to construct a Taylor series expansion. $\beta$ is a simple zero of $q$; $\tilde\beta$ a simple zero of $q + \delta q$. Let $q(\tau) = (\tau-\beta)\,q_\beta(\tau)$ and

\[ x(\tau) = -\delta q(\tau)/q_\beta(\tau) \]

as in (2.5). Then

\[ \tilde\beta - \beta = x(\beta) + x(\beta)\,x'(\beta) + \cdots \]

\[ |\tilde\beta - \beta| \le |x(\beta)| + |x(\beta)|\,|x'(\beta)| + \cdots. \]

We can use $\|\delta q\|_Q \le \Delta_q$ to obtain bounds for these terms. For instance,

\[ \delta q(\beta) = e_\beta^*\,\delta q \]

where $e_\beta^*$ is the functional that evaluates a polynomial at $\beta$. Then

\[ |\delta q(\beta)| \le \|e_\beta^*\|_Q\,\|\delta q\|_Q \le \|e_\beta^*\|_Q\,\Delta_q. \]

Now

\[ \|e_\beta^*\|_Q = \bigl(e_\beta^*(P_1^*P_{m-1}^*WP_{m-1}P_1)^{-1}e_\beta\bigr)^{1/2} \]

which is a constant that may be evaluated. So

\[ |x(\beta)| \le \|e_\beta^*\|_Q\,\Delta_q/|q_\beta(\beta)| \]

and succeeding terms may be calculated in the same way. The bounds can be calculated with just a few terms if $q_\beta(\beta)$ is not too small. Thus we may bound the change in $\alpha$ and $\beta$ due to movements along the manifold.
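The two-term Taylor expansion for a simple zero can be compared with a direct root computation. The sketch below is illustrative only: the polynomial $q(\tau) = (\tau+1)(\tau-2)$, the perturbation `dq`, and the helper name `simple_zero_shift` are assumptions made for the example, and $x(\tau) = -\delta q(\tau)/q_\beta(\tau)$ is differentiated with the quotient rule.

```python
import numpy as np
from numpy.polynomial import Polynomial


def simple_zero_shift(q_beta, dq, beta):
    """First two Taylor terms x(beta) and x(beta)*x'(beta), x = -dq/q_beta."""
    qb, qb1 = q_beta(beta), q_beta.deriv()(beta)
    d, d1 = dq(beta), dq.deriv()(beta)
    x = -d / qb
    x1 = -(d1 * qb - d * qb1) / qb**2     # quotient rule for x'(beta)
    return x, x * x1


beta = -1.0
q_beta = Polynomial([-2.0, 1.0])                 # q_beta(tau) = tau - 2
q = Polynomial([-beta, 1.0]) * q_beta            # q(tau) = (tau + 1)(tau - 2)
dq = 1e-3 * Polynomial([0.3, 0.1])               # degree below that of q

first, second = simple_zero_shift(q_beta, dq, beta)
roots = (q + dq).roots()
beta_tilde = roots[np.argmin(abs(roots - beta))]  # perturbed zero nearest beta
residual = abs((beta_tilde - beta) - (first + second))
```

With a perturbation of size $10^{-3}$ the two terms reproduce the shift to roughly third order in the perturbation, which is why only a few terms are needed when $q_\beta(\beta)$ is not too small.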


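Next to consider are changes due to movements orthogonal to the manifold. Suppose we are at

\[ \tilde p(\tau) = (\tau-\tilde\alpha)^m\,\tilde q(\tau), \]

and $\tilde\beta$ is a zero of $\tilde q$. Then an orthogonal perturbation is

\[ W^{-1}\tilde A^*\delta\ell. \]

To see what happens to $\tilde\alpha$, use a formula such as (2.6). First define

\[ x(\tau) = -(W^{-1}\tilde A^*\delta\ell)(\tau)/\tilde q(\tau); \]

then for $\bar\alpha$, a zero of $\bar p = \tilde p + W^{-1}\tilde A^*\delta\ell$,

\[ \bar\alpha - \tilde\alpha = (x(\tilde\alpha))^{1/m} + \cdots. \]

Now

\[ x(\tilde\alpha) = -\tilde e^*W^{-1}\tilde A^*\delta\ell/\tilde q(\tilde\alpha), \]

\[ |x(\tilde\alpha)| \le |\tilde e^*W^{-1}\tilde A^*\delta\ell|/|\tilde q(\tilde\alpha)|. \]

If $\tilde\beta_i$ are the zeros of $\tilde q$, then $|\tilde q(\tilde\alpha)| = \prod_i |\tilde\alpha - \tilde\beta_i|$. A lower bound may be calculated by using

\[ |\tilde\alpha - \tilde\beta_i| \ge |\alpha - \beta_i| - \Delta_\alpha - \Delta\beta_i \]

where $\Delta\beta_i$ is the bound for $|\tilde\beta_i - \beta_i|$ computed previously.

As for the other term,

\[ |\tilde e^*W^{-1}\tilde A^*\delta\ell| \le \|\tilde e^*W^{-1}\tilde A^*\|_L\,\|\delta\ell\|_L \le \|\tilde e^*W^{-1}\tilde A^*\|_L\,\Delta_\ell; \]

\[ \|\tilde e^*W^{-1}\tilde A^*\|_L = \bigl(\tilde e^*W^{-1}\tilde A^*(\tilde AW^{-1}\tilde A^*)^{-1}\tilde AW^{-1}\tilde e\bigr)^{1/2}. \]

Since $|\alpha| - \Delta_\alpha \le |\tilde\alpha| \le |\alpha| + \Delta_\alpha$ we can compute a bound for $|x(\tilde\alpha)|$ and for the other terms of $|\bar\alpha - \tilde\alpha|$.

Similarly we can compute a bound for $|\bar\beta - \tilde\beta|$ for $\bar\beta$, one of the other zeros of $\bar p$. The process is similar to that for $|\tilde\beta - \beta|$.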

Obviously  these  derivations  would  be  much  less  tedious  if  a 
suitable  algebraic  manipulation  system  were  available  to  do  part  of 
the  work. 

So far it may not be apparent that the process described is much of an improvement. A simple example in the next section shows that the payoff can be substantial.

7. An Example of Expansions

We will apply both the classical and new expansion techniques to an example. It will become evident that the new expansion technique is very much dependent on a symbolic manipulation system like MACSYMA or REDUCE [38] for its successful implementation. Even though the example we provide is somewhat contrived, the amount of algebra required is substantial.

We will study the zeros of polynomials in the neighborhood of the real cubic

\[ p(\tau) = \tau^3 - (1+\delta)\tau^2 - (1+\delta)\tau + (1-\delta), \]

with $\delta = 1\mathrm{E}{-6}$. Its three simple zeros are

\[ \beta = -.99999975, \]
\[ \alpha_1 = .99877563, \]
\[ \alpha_2 = 1.00122512. \]

The last two of these are somewhat ill conditioned. We will use the uniform norm in which all weights are 1; then the condition numbers of $\alpha_1$ and $\alpha_2$ are about 350; the number of $\beta$ is about .43.

The results are given in Tables VII.1 and VII.2. $p$ is the polynomial with zeros $\alpha_1$, $\alpha_2$, and $\beta$. $\hat p$ is the nearest polynomial with a double zero:

\[ \hat p(\tau) = (\tau-\hat\alpha)^2(\tau-\hat\beta) \]

where $\hat\alpha = 1$ and $\hat\beta = -1$. Finally $\bar\beta$ and $\bar\alpha$ represent zeros of an arbitrary polynomial $\bar p$ such that $r = \bar p - p$ with $\|r\| \le \Delta$, or $\hat r = \bar p - \hat p$ with $\|\hat r\| \le \Delta$.
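The quoted zeros and condition numbers of the example cubic can be reproduced numerically. The sketch below is a check under an assumption: the condition number of a simple zero $z$ is taken to be $\|(1, z, z^2)\|_2/|p'(z)|$, that is, unit weights on the three non-leading coefficients. The definition from section II.4 is not restated here, so this formula is a stand-in that happens to reproduce the quoted magnitudes.

```python
import numpy as np

delta = 1e-6
# p(tau) = tau^3 - (1+delta)*tau^2 - (1+delta)*tau + (1-delta), highest power first
coeffs = [1.0, -(1.0 + delta), -(1.0 + delta), 1.0 - delta]

zeros = np.sort(np.roots(coeffs).real)        # beta, alpha_1, alpha_2
beta, alpha_1, alpha_2 = zeros


def condition_number(z):
    """Assumed form: 2-norm of (1, z, z^2) divided by |p'(z)|."""
    dp = np.polyval(np.polyder(coeffs), z)
    return np.sqrt(1.0 + z**2 + z**4) / abs(dp)


cond_beta = condition_number(beta)
cond_alpha_1 = condition_number(alpha_1)
cond_alpha_2 = condition_number(alpha_2)
```

The two zeros near 1 are separated by only about $2.4 \times 10^{-3}$, which is what makes $|p'|$ small there and the condition numbers large; the zero near $-1$ is far from the others and correspondingly well conditioned.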


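Table VII.1. Expansions to $\bar p$

Classical Taylor series

From $\beta = -.99999975$, a simple zero of $p$:

\[ \bar\beta = \beta - .25\,r(\beta) + .25\,r(\beta)\{.25\,r(\beta) + .25\,r'(\beta)\} + O(r^3). \]

From $\hat\beta = -1$, a simple zero of $\hat p$:

\[ \bar\beta = \hat\beta - \tfrac14\hat r(\hat\beta) + \tfrac14\hat r(\hat\beta)\{\tfrac14\hat r(\hat\beta) + \tfrac14\hat r'(\hat\beta)\} + O(\hat r^3). \]

From $\alpha_1 = .99877563$ or $\alpha_2 = 1.00122512$, simple zeros of $p$:

\[ \bar\alpha_1 = \alpha_1 + 204\,r(\alpha_1) + 204\,r(\alpha_1)\{83282\,r(\alpha_1) + 204\,r'(\alpha_1)\} + O(r^3), \]
\[ \bar\alpha_2 = \alpha_2 - 204\,r(\alpha_2) - 204\,r(\alpha_2)\{83386\,r(\alpha_2) - 204\,r'(\alpha_2)\} + O(r^3). \]

Classical Puiseux fractional power series

From $\hat\alpha = 1$, a double zero of $\hat p$:

\[ \begin{aligned} \bar\alpha = \hat\alpha &+ \sqrt{-\tfrac12\hat r(\hat\alpha)} + \tfrac14\{\tfrac12\hat r(\hat\alpha) - \hat r'(\hat\alpha)\} \\ &+ \tfrac18\sqrt{-\tfrac12\hat r(\hat\alpha)}\,\Bigl\{-\tfrac12\hat r(\hat\alpha) + \hat r'(\hat\alpha) - \hat r''(\hat\alpha) - \frac{(\tfrac12\hat r(\hat\alpha) - \hat r'(\hat\alpha))^2}{2\hat r(\hat\alpha)}\Bigr\} \\ &+ O(\hat r^2). \end{aligned} \]

"Expansions" based on the new technique

From $\hat\beta = -1$, a simple zero of $\hat p$:

\[ \tilde\beta = \hat\beta - \delta q, \]
\[ \bar\beta = \tilde\beta + x_\beta(\tilde\beta) + x_\beta(\tilde\beta)\,x_\beta'(\tilde\beta) + O(x_\beta^3). \]

From $\hat\alpha = 1$, a double zero of $\hat p$:

\[ \tilde\alpha = \hat\alpha + \delta\alpha, \]
\[ \bar\alpha = \tilde\alpha + \sqrt{x_\alpha(\tilde\alpha)} + \tfrac12 x_\alpha'(\tilde\alpha) + \tfrac14\sqrt{x_\alpha(\tilde\alpha)}\,\Bigl(x_\alpha''(\tilde\alpha) + \frac{(x_\alpha'(\tilde\alpha))^2}{2x_\alpha(\tilde\alpha)}\Bigr) + O(x_\alpha^2). \]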
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Table  VII. 2.  Bounds  on  Zeros 

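Crude bounds based on classical expansions

\[ |\bar\beta - \beta| \le .43\Delta + .43\Delta^2 + O(\Delta^3) \]
\[ |\bar\beta - \hat\beta| \le .43\Delta + .43\Delta^2 + O(\Delta^3) \]
\[ |\bar\alpha_2 - \alpha_2| \le 353\Delta + 5.1\mathrm{E}7\,\Delta^2 + O(\Delta^3) \]
\[ |\bar\alpha - \hat\alpha| \le .93\Delta^{1/2} + .78\Delta + \Bigl(.60 + \frac{\cdots}{\sqrt{|\hat r(\hat\alpha)|/\Delta}}\Bigr)\Delta^{3/2} + O(\Delta^2) \]

Crude bounds based on the new technique

\[ |\bar\beta - \hat\beta| \le .84\Delta + .38\Delta^2 + O(\Delta^3) \]
\[ |\bar\alpha - \hat\alpha| \le .93\Delta^{1/2} + 1.00\Delta + .66\Delta^{3/2} + O(\Delta^2) \]

Best possible bounds based on classical expansions

\[ |\bar\beta - \beta| \le .43\Delta + .078\Delta^2 + O(\Delta^3) \]
\[ |\bar\beta - \hat\beta| \le .43\Delta + .078\Delta^2 + O(\Delta^3) \]
\[ |\bar\alpha_2 - \alpha_2| \le 353\Delta + 5.1\mathrm{E}7\,\Delta^2 + O(\Delta^3) \]
\[ |\bar\alpha - \hat\alpha| \le .93\Delta^{1/2} + .42\Delta + \Bigl(.13 + \frac{\cdots}{\sqrt{|\hat r(\hat\alpha)|/\Delta}}\Bigr)\Delta^{3/2} + O(\Delta^2) \]

Best possible bounds based on the new technique

\[ |\bar\beta - \hat\beta| \le .43\Delta + O(\Delta^3) \]
\[ |\bar\alpha - \hat\alpha| \le .93\Delta^{1/2} + .42\Delta + .0084\Delta^{3/2} + O(\Delta^2) \]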
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Table  VII. 1 represents  expansions  to  p from  p and  p.  There 
is  little  difference  in  the  expansions  for  §,  but  the  difference  for 
a is  remarkable.  Starting  from  the  ill  conditioned  zeros  a,  the 
Taylor  series  terms  have  huge  coefficients  reflecting  short  radii  of 
convergence.  In  contrast,  the  fractional  power  series  expansion  from 
the  double  zero  at  a has  modest  coefficients  but  exhibits  a different 
kind  of  shortcoming:  in  certain  directions  the  fractional  power  series 

does  not  exist  at  all,  namely  those  directions,  tangent  to  the  manifold, 
such  that  r(a)  = 0.  Then  the  coefficient  of  the  third  term  becomes 
infinite  because  its  denominator  contains  (r(a))  ' . As  we  have  seen, 
in  this  direction  the  proper  series  expansions  consist  of  a trivial 
one  a = a and  a Taylor  series  in  integral  powers  of  r.  It  is  easy 
enough  to  bound  changes  in  that  special  direction;  the  severe  problem 
is  that  when  r(a)  is  not  zero  but  is  small  compared  to  Jr|,  the 
terms  in  which  r(a)'1  appears  have  huge  coefficients. 

"Expansions"  are  also  given  in  the  form  produced  by  the  new 
technique. These expansions are not useful until converted into
bounds, since they are not in terms of a perturbation $r$ but rather
depend on the unknowns $\bar\alpha$ or $\bar\beta$, and on $x$, which is defined below
in terms of an orthogonal perturbation.

Table VII.2 shows bounds for the changes in the zeros based on
the expansions. The table gives both "crude" bounds, which reflect
the simplest approximations that come to mind, and "best possible"
bounds, which reflect a finer analysis. An automatic symbol manipulator
might produce rather crude bounds while the best possible bounds would
likely be produced by a human analyst.

The bounds for $\tilde\beta$ are not of much interest. The bounds for $\tilde\alpha$
reflect the same difficulties as the Taylor or Puiseux series from
which they were derived. The interesting part of Table VII.2 shows
bounds for small $\Delta$ based on the revised expansion techniques
discussed in the previous section. The important improvement is that the
bound for $|\tilde\alpha - \hat\alpha|$ is now independent of the direction of $r$ and all
the coefficients are of modest size. Furthermore the first two terms
are the same as the best classical bound. The new technique may be
used for bounding until $\Delta$ becomes comparable to $|\hat\alpha - \hat\beta|$.

Thus this example vindicates the approach advocated in the previous
section. The rest of the current section provides the details of
computing Tables VII.1 and VII.2. Those details provide convincing
evidence that practical exploitation of the new expansion technique
requires a sophisticated symbol manipulation system.

The bounds computed by Smith's method [42] are somewhat larger
than those in Table VII.2. In particular, that method indicates

    $|\tilde\alpha - \hat\alpha| \le 1.32\Delta^{1/2} + O(\Delta)$.


Details  of  Expansions 

We first construct the expansion from $p$. If we consider a
perturbation $\epsilon r(\tau)$ to $(\tau - \alpha_i)q_i(\tau)$ we find, according to (2.5), that
the perturbed zero

    $\tilde\alpha_i = \alpha_i + x(\alpha_i)\epsilon + x(\alpha_i)x'(\alpha_i)\epsilon^2 + \cdots$

where

    $x(\tau) = -r(\tau)/q_i(\tau)$.

Thus if $i = 1$ then $q_1(\tau) = (\tau - \alpha_2)(\tau - \alpha_3)$, $\alpha_3 = \beta$. Also

    $x(\alpha_1) = -\dfrac{r(\alpha_1)}{(\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3)}$,

    $x'(\alpha_1) = \dfrac{r(\alpha_1)(2\alpha_1 - \alpha_2 - \alpha_3)}{((\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3))^2} - \dfrac{r'(\alpha_1)}{(\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3)}$.

We may represent the polynomial $r$ by the value of $r$ and its
derivatives at $\alpha_i$ or by its coefficients. Using coefficients,

    $r(\tau) = r_1\tau^2 + r_2\tau + r_3$,
    $r(\alpha_i) = r_1\alpha_i^2 + r_2\alpha_i + r_3$,
    $r'(\alpha_i) = 2r_1\alpha_i + r_2$.

Finally let $\epsilon = 1$ to obtain a Taylor series in the coefficients of $r$.
Notice that in the first order term those coefficients appear linearly,
in the second order term they appear quadratically, etc. Substituting
numerical values yields

    $\tilde\beta = \beta - .25\,r(\beta) + .25\,r(\beta)\{.25\,r(\beta) + .25\,r'(\beta)\} + O(r^3)$,
    $\tilde\alpha_1 = \alpha_1 + 204\,r(\alpha_1) + 204\,r(\alpha_1)\{83282\,r(\alpha_1) + 204\,r'(\alpha_1)\} + O(r^3)$,
    $\tilde\alpha_2 = \alpha_2 - 204\,r(\alpha_2) - 204\,r(\alpha_2)\{83386\,r(\alpha_2) - 204\,r'(\alpha_2)\} + O(r^3)$.

The expansions for $\tilde\alpha_1$ and $\tilde\alpha_2$ look unlikely to converge for other
than small $r$; in fact there is a polynomial $\hat p$ with a double zero
at distance $\|r\| \approx 1.7\mathrm{E}{-6}$.
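
The coefficients in these Taylor series can be checked with a few lines of arithmetic. The sketch below is only an illustration, not the code used for the thesis: the function name `taylor_coefficients` is hypothetical, and the zeros are the eight-figure values quoted above, so the last digits of the large coefficients may differ slightly from the printed ones.

```python
def taylor_coefficients(zeros, i):
    """Return (c1, c2_r, c2_dr) for the simple zero a = zeros[i].

    With q(t) the product of (t - z) over the other zeros and
    x(t) = -r(t)/q(t), this gives
        x(a)  = c1 * r(a)
        x'(a) = c2_r * r(a) + c2_dr * r'(a).
    """
    a = zeros[i]
    others = [z for k, z in enumerate(zeros) if k != i]
    q = 1.0
    for z in others:
        q *= (a - z)
    # q'(a): sum over each factor removed in turn
    dq = 0.0
    for k in range(len(others)):
        term = 1.0
        for m, z in enumerate(others):
            if m != k:
                term *= (a - z)
        dq += term
    return -1.0 / q, dq / q**2, -1.0 / q


zeros_p = [0.99877563, 1.00122512, -0.99999975]   # alpha_1, alpha_2, beta
for i in range(3):
    print(taylor_coefficients(zeros_p, i))
```

The first-order coefficients come out near 204, -204 and -.25, and the large second-order coefficient for $\alpha_1$ comes out near 83282, in line with the expansions above.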

We now consider expansions from

    $\hat p(\tau) = (\tau - \hat\alpha)^2(\tau - \hat\beta)$

with $\hat\alpha = 1$ and $\hat\beta = -1$. We will compute the effect of a
perturbation $r(\tau) = \tilde p(\tau) - \hat p(\tau)$ on $\hat\alpha$ and $\hat\beta$. For $\hat\beta$, following (2.5),

Bounds from Expansion

The changes in zeros may be crudely bounded in a straightforward
way:

    $|\tilde\beta - \beta| \le .25|r(\beta)| + \tfrac{1}{16}|r(\beta)|\{|r(\beta)| + |r'(\beta)|\} + O(r^3)$.

But $|r(\beta)| \le \|(\beta^2\ \ \beta\ \ 1)\| \cdot \|r\| \le \sqrt3\,\Delta$. Similarly
$|r'(\beta)| \le \|(2\beta\ \ 1\ \ 0)\|\,\|r\| \le \sqrt5\,\Delta$. So

    $|\tilde\beta - \beta| \le .433\Delta + .430\Delta^2 + O(\Delta^3)$.

The bounds for $|\tilde\beta - \hat\beta|$ and $|\tilde\alpha - \alpha|$ are similarly derived. As we have
seen, bounds for $|\tilde\alpha - \hat\alpha|$ independent of $r$ do not exist.

We can improve on these bounds by taking a little care. For
instance, the second term in the expansion for $\tilde\beta - \beta$ is

    $r(\beta)\{r(\beta) + r'(\beta)\}/16$.

Writing $r(\tau) = r_1\tau^2 + r_2\tau + r_3$ we find that term becomes

    $(r_1 - r_2 + r_3)(-r_1 + r_3)/16$.

Then the question is: how large can

    $|(r_1 - r_2 + r_3)(-r_1 + r_3)|/16$

be, subject to the constraint $\|r\|^2 = |r_1|^2 + |r_2|^2 + |r_3|^2 = \Delta^2$? This
problem in non-linear optimization can be solved, for instance with
a Lagrange multiplier, to find that the desired maximum is $.0776\Delta^2$.
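Similarly the second term in the expansion for $\tilde\alpha - \hat\alpha$ is

    $\tfrac12\{\tfrac14 r(\hat\alpha) - \tfrac12 r'(\hat\alpha)\}$.

While we could bound the term as

    $\tfrac12\{\tfrac14|r(\hat\alpha)| + \tfrac12|r'(\hat\alpha)|\} \le \sqrt3\,\Delta/8 + \sqrt5\,\Delta/4 = .776\Delta$,

we do better to observe that $r(\hat\alpha) = r_1 + r_2 + r_3$ and
$r'(\hat\alpha) = 2r_1 + r_2$, so we wish to maximize

    $\tfrac18\,|{-3r_1} - r_2 + r_3| \le (\sqrt{11}/8)\Delta = .415\Delta$.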
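
The gap between the crude and the sharper bound is just the difference between bounding each linear functional of $r$ separately and bounding their sum as a single functional (Cauchy-Schwarz). A minimal sketch follows; the function names are illustrative and not from the original codes.

```python
import math


def crude_bound(terms):
    """Bound each functional w * (v . r) separately and add: sum |w| * ||v||."""
    return sum(abs(w) * math.sqrt(sum(c * c for c in v)) for w, v in terms)


def best_bound(terms):
    """Combine the functionals first; max |c . r| over ||r|| <= 1 is ||c||."""
    n = len(terms[0][1])
    combined = [sum(w * v[k] for w, v in terms) for k in range(n)]
    return math.sqrt(sum(c * c for c in combined))


# For r(t) = r1 t^2 + r2 t + r3 at t = 1:
#   r(1) = (1, 1, 1) . r      r'(1) = (2, 1, 0) . r
# Second term of the expansion: r(1)/8 - r'(1)/4.
terms = [(1 / 8, (1, 1, 1)), (-1 / 4, (2, 1, 0))]
print(round(crude_bound(terms), 3), round(best_bound(terms), 3))  # 0.776 0.415
```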


Bounds  from  the  New  Technique 

Now consider how the zeros change when subject to perturbations
of the form discussed in section 6. First, $\hat p$ is perturbed to

    $\bar p(\tau) = (\tau - \bar\alpha)^2(\tau - \bar\beta)$

by movement along the manifold. Then, an orthogonal perturbation

    $\delta\lambda\,((\bar\alpha^*)^{n-1}\ \ (\bar\alpha^*)^{n-2}\ \ \cdots\ \ \bar\alpha^*\ \ 1)$

is applied. The total perturbation should be commensurate with $\Delta$,
which to simplify matters will be taken to be no larger than $10^{-4}$.
Corresponding to the bound $\|r\| \le \Delta$ for the conventional
expansion we have (6.1):

    $\left\| \begin{pmatrix} \delta\lambda \\ \delta q \\ \delta\alpha \end{pmatrix} \right\|_H^2 = (\delta\lambda^*\ \ \delta q^*\ \ \delta\alpha^*)\, H \begin{pmatrix} \delta\lambda \\ \delta q \\ \delta\alpha \end{pmatrix} \le \Delta^2$.

To compute the components of $H$, note

    [intermediate matrix computation not legible in the source]

So

    $H = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 8 \end{pmatrix}$.


We will compute the best possible bounds from $H$, but for the crude
bounds we will use (6.2). Then $v = 0$ so (6.2) becomes

    $|\delta\lambda| \le (\sqrt3/3)\Delta$,
    $|\delta q| = |\delta\beta| \le (\sqrt6/6)\Delta$,
and $|\delta\alpha| \le (\sqrt2/4)\Delta$.
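
The diagonal entries 3, 6 and 8 can be reproduced numerically under one assumption, which is an interpretation and not a statement from the text: that $H$ is the Gram matrix, in the unweighted coefficient norm, of the three first-order perturbation directions at $\hat\alpha = 1$, $\hat\beta = -1$. The sketch below builds those directions as coefficient vectors for $\tau^2, \tau, 1$; all names are illustrative.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


alpha, beta = 1.0, -1.0

# Coefficients of tau^2, tau, 1 for each first-order direction.
d_lambda = [alpha**2, alpha, 1.0]           # orthogonal direction
d_q = [1.0, -2.0 * alpha, alpha**2]         # (tau - alpha)^2
d_alpha = [-2.0, 2.0 * (alpha + beta), -2.0 * alpha * beta]  # d/d(alpha)

directions = [d_lambda, d_q, d_alpha]
H = [[dot(u, v) for v in directions] for u in directions]
print(H)

# With a diagonal H, each component alone is bounded by Delta / sqrt(H_kk).
bounds = [1.0 / math.sqrt(H[k][k]) for k in range(3)]
print([round(b, 3) for b in bounds])        # coefficients of Delta
```

Under this assumption the three directions are mutually orthogonal, so the off-diagonal entries vanish and the crude bounds are simply $\Delta/\sqrt3$, $\Delta/\sqrt6$ and $\Delta/\sqrt8$.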

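In the usual case when deg $q > 1$, $\delta\beta$ is a Taylor series in $\delta q$.
The variation in the double zero $\hat\alpha$ and the simple zero $\hat\beta$ is
thus easily bounded for movements along the manifold. Now we turn to
the effect of the orthogonal movement in the direction of the
perturbation $\delta\lambda$. The effect on $\bar\beta$ may again be deduced from (2.5); let

    $\bar p(\tau) = (\tau - \bar\beta)\,q(\tau)$

so

    $q(\tau) = (\tau - \bar\alpha)^2$.

Then

    $x(\tau) = -\delta\lambda\, \dfrac{\sum_{j=1}^{n} (\bar\alpha^*\tau)^{n-j}}{(\tau - \bar\alpha)^2}$

and

    $x'(\tau) = \delta\lambda \Big\{ \dfrac{2\sum_j (\bar\alpha^*\tau)^{n-j}}{(\tau - \bar\alpha)^3} - \dfrac{\bar\alpha^* \sum_j (n-j)(\bar\alpha^*\tau)^{n-j-1}}{(\tau - \bar\alpha)^2} \Big\}$,

so that

    $x(\bar\beta) = -\delta\lambda \Big( \sum_j (\bar\alpha^*\bar\beta)^{n-j} / (\bar\beta - \bar\alpha)^2 \Big)$,

    $x'(\bar\beta) = \delta\lambda \Big\{ \dfrac{2\sum_j (\bar\alpha^*\bar\beta)^{n-j}}{(\bar\beta - \bar\alpha)^3} - \dfrac{\bar\alpha^* \sum_j (n-j)(\bar\alpha^*\bar\beta)^{n-j-1}}{(\bar\beta - \bar\alpha)^2} \Big\}$.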


Since $|\bar\alpha - \hat\alpha| \le \Delta$ and $|\bar\beta - \hat\beta| \le \Delta$ and $\Delta \le 10^{-4}$, in the bounds that
follow no harm is done by substituting $\hat\alpha$ for $\bar\alpha$ and $\hat\beta$ for $\bar\beta$,
since the resulting coefficients will only be given to 3 figures. For
larger $\Delta$ more care must be taken. In particular, if the perturbation
along the manifold is extended far enough to reach the next higher
manifold, where $\bar\alpha = \bar\beta$, the bounds below will be utterly wrong.

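To get a crude bound, we would use

    $|x(\bar\beta)| \le |\delta\lambda| \Big( \sum_j |\bar\alpha^*\bar\beta|^{n-j} / |\bar\beta - \bar\alpha|^2 \Big) = \tfrac34|\delta\lambda| \le (\sqrt3/4)\Delta$,
    $|x'(\bar\beta)| \le \tfrac32|\delta\lambda| \le (\sqrt3/2)\Delta$.

Then

    $|\tilde\beta - \bar\beta| \le |x(\bar\beta)| + |x(\bar\beta)|\,|x'(\bar\beta)| + \cdots$
        $\le (\sqrt3/4)\Delta + \tfrac38\Delta^2 + \cdots$.

Since

    $|\bar\beta - \hat\beta| = |\delta\beta| \le (\sqrt6/6)\Delta$,

we get

    $|\tilde\beta - \hat\beta| \le (\sqrt3/4 + \sqrt6/6)\Delta + \tfrac38\Delta^2 + \cdots$
        $\le .841\Delta + .375\Delta^2 + O(\Delta^3)$.

Substituting we find

    $x(\bar\alpha) = -\delta\lambda \Big( \sum_j |\bar\alpha^{\,n-j}|^2 / (\bar\alpha - \bar\beta) \Big)$,

    $x'(\bar\alpha) = \delta\lambda \Big\{ \dfrac{\sum_j |\bar\alpha^{\,n-j}|^2}{(\bar\alpha - \bar\beta)^2} - \dfrac{\bar\alpha^* \sum_j (n-j)\,|\bar\alpha^{\,n-j-1}|^2}{(\bar\alpha - \bar\beta)} \Big\}$,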


x"(a)  = ^{L5*-)2I(n-j)(n-j-1)lan'J'2|2  2a*y(n-,i ) Ig""^1 12 


~n-j-2|2 


(5-B) 


^-j.2 


(a-sF 


-}  . 


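Then to get a crude bound,

    $|x(\hat\alpha)| = \tfrac32|\delta\lambda| \le (\sqrt3/2)\Delta$,
    $|x'(\hat\alpha)| \le |\delta\lambda|(\tfrac34 + \tfrac32) \le (3\sqrt3/4)\Delta$,
    $|x''(\hat\alpha)| \le |\delta\lambda|(1 + \tfrac32 + \tfrac34) \le (13\sqrt3/12)\Delta$.

Since

    $\tilde\alpha - \hat\alpha = \delta\alpha + \sqrt{x(\hat\alpha)} + \tfrac12 x'(\hat\alpha) + \tfrac14\sqrt{x(\hat\alpha)}\,\Big\{x''(\hat\alpha) + \dfrac{(x'(\hat\alpha))^2}{2\,x(\hat\alpha)}\Big\} + \cdots$

then

    $|\tilde\alpha - \hat\alpha| \le (\sqrt3/2)^{1/2}\Delta^{1/2} + (\sqrt2/4 + 3\sqrt3/8)\Delta + \cdots$

or

    $|\tilde\alpha - \hat\alpha| \le .931\Delta^{1/2} + 1.003\Delta + .663\Delta^{3/2} + O(\Delta^2)$.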
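
The three crude coefficients can be recomputed directly from the bounds on $|\delta\lambda|$ and $|\delta\alpha|$. This is a plain arithmetic check, with variable names chosen here for illustration; each quantity is the coefficient of the stated power of $\Delta$.

```python
import math

abs_dlambda = 1 / math.sqrt(3)     # |delta lambda| <= Delta / sqrt(3)
abs_dalpha = math.sqrt(2) / 4      # |delta alpha|  <= (sqrt(2)/4) Delta

x0 = 1.5 * abs_dlambda                   # bound on |x|   / Delta
x1 = (0.75 + 1.5) * abs_dlambda          # bound on |x'|  / Delta
x2 = (1 + 1.5 + 0.75) * abs_dlambda      # bound on |x''| / Delta

c_half = math.sqrt(x0)                                    # Delta^(1/2) term
c_one = abs_dalpha + 0.5 * x1                             # Delta term
c_three_halves = 0.25 * math.sqrt(x0) * (x2 + x1**2 / (2 * x0))  # Delta^(3/2)

print(round(c_half, 3), round(c_one, 3), round(c_three_halves, 3))
# 0.931 1.003 0.663
```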


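To get the corresponding best possible bounds, note that

    $x(\hat\alpha) = -\tfrac32\,\delta\lambda$,
    $x'(\hat\alpha) = -\tfrac34\,\delta\lambda$,
    $x''(\hat\alpha) = \tfrac14\,\delta\lambda$.

Then for the second term $\delta\alpha + \tfrac12 x'(\hat\alpha)$ we have

    $|\delta\alpha - \tfrac38\delta\lambda| \le \|(-\tfrac38\ \ 0\ \ 1)\|_{H^{-1}}\,\Delta = (\sqrt{11}/8)\Delta$.

For the third term,

    $\Big| \tfrac14\sqrt{x(\hat\alpha)}\,\Big(x''(\hat\alpha) + \dfrac{(x'(\hat\alpha))^2}{2\,x(\hat\alpha)}\Big) \Big| \le .0084\Delta^{3/2}$.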


CHAPTER  VIII 


EXPERIMENTAL  METHODS 


1. Introduction

In the next chapter experimental results will be given which
vindicate the theory of previous chapters. After that we will
present experimental results for a class of polynomials more difficult
to understand.

In  the  present  chapter  we  describe  how  the  nearest  polynomials 
with  given  multiplicity  configurations  were  found.  Then  we  explain 
the  tests  made  to  assure  the  validity  of  the  results.  Finally  we  show 
how  to  contrive  test  problems  with  known  answers. 

Experiments  were  carried  out  on  the  CDC  6400  at  the  University  of 
California,  Berkeley.  Coding  was  in  the  FORTRAN  language  for  the 
University  of  Washington  RUN  compiler.  Although  most  of  the  codes 
usually  perform  satisfactorily  in  the  stated  environment  they  are  not 
presently in a portable form that would work reliably in other
environments. Consequently a detailed discussion and listing of these
codes is not included here.

2. How the Equations were Solved

Chapters III-V presented various equations to be solved for
solutions $\zeta$ corresponding to nearest polynomials with one or more
multiple zeros. Expressions were usually obtained both for a function
and its partial derivatives so that Newton's method could be applied.

To use any iterative method, however, starting guesses must be
supplied.

Usually the starting point was taken to be a zero of the
appropriate derivative. Thus, if the nearest polynomial with a double zero
was sought, a starting point would be chosen from among the zeros of
the first derivative. One might try to use the zeros of the original
polynomial, but the zeros of the derivative seemed more often to lead
to faster convergence.

In order to maximize the probability of first finding the globally
nearest polynomial with the desired multiplicity configuration, the
starting points were tried in a definite order. That order was fixed
by computing the distance to the nearest polynomial with that
starting point as a double zero. That distance is an upper bound for the
distance to the manifold from that starting point. The starting points
with the least upper bounds were used first.

The  same  criterion  for  choosing  among  starting  points  could  be 
used  if  the  starting  points  were  the  zeros  of  the  original  polynomial. 
In  this  case,  however,  it  would  be  equally  appropriate  to  rank  the 
starting  points  according  to  their  condition  numbers. 

Once a starting point was chosen, Newton's method was used in all
but one instance. That case exploited the fact that the equation for
the nearest polynomial with a double zero always has a real solution
between two real zeros of a real polynomial. Those two real zeros may
be used as starting points for a secant-like iteration for $\zeta$; among
many such iterations Brent's [2] is a well known recent one. Brent's
method was used to quickly locate real solutions whenever appropriate.

In order to terminate the iteration an error bound on the
function evaluation was computed. When the function whose zero was sought
was reduced below its error bound, the current iterate was accepted as
a zero. These error bounds were usually computed with the aid of
interval arithmetic [24]. The lack of suitable facilities for
interval arithmetic in CDC hardware and software made it necessary to code
interval operations as subroutine calls, making the codes for the
functions virtually unreadable, and thereby providing another reason
for not publishing those codes here.

If  no  solution  was  found  after  a fixed  number  of  iterations 
(usually  40)  the  iteration  was  terminated  and  another  starting  point 
tried.  If  a solution  was  found  it  was  added  to  the  list  of  known 
solutions  used  to  deflate  the  function,  as  described  in  one  of  the 
appendices. 
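
The overall control flow described in this section (ordered starting points, Newton's method, acceptance when the function falls below its error bound, and a cap of about 40 iterations) might be sketched as follows. This is a simplified illustration, not the original FORTRAN: the name `newton_from_starts` is hypothetical, a fixed tolerance stands in for the interval-arithmetic error bound, and duplicate solutions are merely skipped rather than deflated out of the function.

```python
def newton_from_starts(f, df, starts, tol, max_iter=40):
    """Try each starting point in order; collect distinct converged solutions."""
    solutions = []
    for z in starts:
        for _ in range(max_iter):
            fz = f(z)
            if abs(fz) <= tol:              # below the (assumed) error bound
                if all(abs(z - s) > 1e-8 for s in solutions):
                    solutions.append(z)
                break
            dfz = df(z)
            if dfz == 0:
                break                       # Newton step undefined; next start
            z = z - fz / dfz
        # if the loop ends without a break, this start is abandoned
    return solutions


# Toy usage on f(z) = z^2 - 2 with three starting guesses.
roots = newton_from_starts(lambda z: z * z - 2, lambda z: 2 * z,
                           starts=[1.0, -1.0, 1.5], tol=1e-12)
print(roots)
```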

When all the reasonable starting points had been tried the
accumulated solutions were checked for correctness and the
corresponding perturbations analyzed.

3. How Do We Know the Answers are Correct?

The methods just described produce one or more solutions $\zeta$
corresponding to locally nearest polynomials with a given multiplicity
configuration. The next step is to compute each polynomial from its
$\zeta$, and check that it is indeed an appropriate solution. Because no
similar computations suitable for comparison have been published,
extra care was necessary to be sure that the numerical results were
reliable.

It must be understood from the outset that in general we can not
be sure of having obtained the global minimum. With no theoretical
information on the size of the second derivative or on the number of
local minima that may exist in a region the best that can be done is
to obtain as many local minima as possible and examine each.
Empirically we have never found more than n + 2 local minima while
searching for the nearest polynomial with a single multiple zero, so that
task is not quite hopeless. Furthermore, whenever one might reasonably
expect from the nature of a problem that one minimum would clearly be
much better than the rest, that minimum has always been found
approximately as expected. An example of such a problem is one in which a
perturbation is applied to a polynomial having one multiple zero and
several simple zeros, all well conditioned in the sense of chapter II.
Thus the perturbed polynomial has simple zeros near the simple zeros
of the unperturbed polynomial, but the multiple zero has divided into
several very ill conditioned zeros. When the computer codes are asked
to find the polynomial with an appropriately multiple zero nearest
that perturbed polynomial, they have so far always found a locally
closest polynomial with a multiple zero near the multiple zero of the
original unperturbed polynomial. In the circumstances described,
moreover, none of the other local minima are competitive in distance.
Thus it seems highly likely that the best local minimum is really the
global minimum.

There is the additional complication that our results are for
real polynomials and, as we have seen in chapters III and V, it is
sometimes necessary to solve an extra set of equations for higher
multiplicity in order to find the global minimum. In our experience
with double zeros, only once has a better minimum been found by
solving the equation for a triple zero. Thus our overall results are
probably not seriously compromised by failing to check for quadruple
zeros when searching for triples, or for various higher configurations
when searching for two or more doubles.

The reader may wonder why it is so easy to find the $\zeta$'s when
the starting points are near ill conditioned zeros of a polynomial.
After all, ill conditioned zeros themselves are almost by definition
difficult to find.

The explanation lies in the form in which polynomials are
presented to our codes, namely as a list of their zeros. If the
polynomials were represented by their coefficients, as they are represented
to a subroutine to find zeros of polynomials, then the solutions $\zeta$
to the equations we wish to solve would also be ill conditioned
functions of the input data. But since ill conditioned zeros are normally
recognizable as a problem requiring amelioration only when those zeros
are in hand, the sensible form for representing that ill conditioned
polynomial is by its zeros rather than its coefficients. In that form
the polynomial may always be evaluated with low relative error, even
near its zeros.

4. Computed Checks on Results

Once a $\zeta$ has been found, we can compute the perturbing
polynomial $q(\tau)$ by an equation such as (III.6.4). Then $p(\tau) + q(\tau)$ should
be locally nearest to $p(\tau)$ and should have a multiple zero $\zeta$ of
the intended multiplicity m, or several $\zeta$'s of appropriate
multiplicities if that was what was requested.

Analytical  errors,  approximation  errors,  coding  errors,  and 
rounding  errors  could  all  cause  the  results  to  be  other  than  expected, 
so  each  assertion  about  p + q is  checked  in  the  codes. 

Note that p + q is never represented by computing the
coefficients of p + q. Since the coefficients of q are usually intended
to be small perturbations of the coefficients of p, adding them
together would entail severe loss of significance. Therefore to
evaluate $(p+q)(\eta)$ at a specific $\eta$, compute

    $p(\eta) = \prod_{i=1}^{n} (\eta - \alpha_i)$

and

    $q(\eta) = \sum_{j=1}^{n} q_j\,\eta^{n-j}$

and then add $p(\eta)$ and $q(\eta)$.
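
A minimal sketch of this evaluation scheme, assuming p is monic and given by its zeros while q is given by its coefficients from highest power down (the function name is illustrative):

```python
def eval_p_plus_q(zeros, q_coeffs, eta):
    """Evaluate (p + q)(eta) without ever forming the coefficients of p + q.

    p(t) = prod (t - a) over the zeros a; q(t) = q_1 t^(n-1) + ... + q_n.
    """
    p_val = 1.0
    for a in zeros:
        p_val *= (eta - a)          # product form keeps low relative error
    q_val = 0.0
    for c in q_coeffs:              # Horner's rule, highest power first
        q_val = q_val * eta + c
    return p_val + q_val


# p(t) = (t - 1)^2 (t + 1) with a tiny constant perturbation q(t) = 1e-12.
value = eval_p_plus_q([1.0, 1.0, -1.0], [0.0, 0.0, 1e-12], 1.0)
print(value)
```

At the double zero the product form returns the perturbation exactly, whereas summing the coefficients of p and q first would round the constant term 1 + 1e-12 before evaluation.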

Using this evaluation scheme our first task is to check the
assertion that $\zeta$ is an m-tuple zero of p + q, i.e.

    $p^{(k)}(\zeta) + q^{(k)}(\zeta) = 0$,  $k = 0, \ldots, m-1$.

We do not expect that equation to be satisfied exactly on a finite
precision computer so we compute error bounds by interval arithmetic
and ask only that

    $p^{(k)}(\zeta) + q^{(k)}(\zeta)$

be within its error bound. That proves that p + q satisfies the
constraint of lying on the manifold of polynomials with m-tuple zeros.

The next assertion to be checked is that p + q represents a
stationary point on the manifold with respect to $\|q\|$. The analysis
of chapter III shows that this is the case if either the last Lagrange
multiplier vanishes or the multiplicity of $\zeta$ in p + q is at least
one greater than requested. For our codes the last Lagrange multiplier
is usually forced to be zero in the solution process for $\zeta$ and q.
If we wish to examine other stationary points which, as we have shown,
can not be minimal with respect to complex perturbations, we check
that one of the stationarity conditions is satisfied.

After checking stationarity we turn to minimality of $\|q\|$.
Minimality may be checked by examining the Hessian matrix of second
derivatives of $\|q\|^2$. Given any fixed $\zeta$ there is a unique q
closest to p such that p + q has an m-tuple zero $\zeta$. Thus $\|q\|$
could be regarded as a real function of two real variables, Re $\zeta$ and
Im $\zeta$, for which partial derivatives can be computed to provide a 2 by 2
Hessian matrix. Alternatively the method of section III.10 could be
used to compute a Hessian matrix for the coefficients of q and the
$\zeta$'s which are now regarded as independent except for constraints. To
simplify computation only real changes in q and $\zeta$ were considered
in computing the constrained Hessian of dimension n + 1 - m.

Using either Hessian, minimality could be checked by computing
the signature. Actually the complete set of eigenvalues was computed
to ascertain the shape of the minimum. Minimality corresponds to all
eigenvalues positive; maximality to all negative; other configurations
correspond to saddle points.
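
For the 2 by 2 case the eigenvalue test can be written in closed form. The sketch below is illustrative only (the function name is hypothetical); it also reports a zero eigenvalue as a degenerate case that the sign test alone cannot decide.

```python
import math


def classify_2x2(hxx, hxy, hyy):
    """Classify a stationary point from the symmetric Hessian
    [[hxx, hxy], [hxy, hyy]] by the signs of its two real eigenvalues."""
    mean = 0.5 * (hxx + hyy)
    radius = math.hypot(0.5 * (hxx - hyy), hxy)
    lo, hi = mean - radius, mean + radius
    if lo > 0:
        kind = "minimum"
    elif hi < 0:
        kind = "maximum"
    elif lo < 0 < hi:
        kind = "saddle"
    else:
        kind = "degenerate"
    return kind, (lo, hi)


print(classify_2x2(2.0, 0.0, 3.0))
print(classify_2x2(1.0, 2.0, 1.0))
```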

After the checks listed above, the other zeros of $p(\tau) + q(\tau)$
were computed, assuming that the m-tuple zero $\zeta$ was known. Then the
n zeros were used to reconstitute the coefficients of a polynomial
whose coefficients should be close to those of $p(\tau) + q(\tau)$. The
explicit coefficients of p + q were computed for use in this check
only. The maximum relative difference was noted and flagged if larger
than roundoff error level. If no flag was noted then the zeros of
p + q were assumed to be reliably computed and their condition numbers
were calculated. Of special interest was the condition number of the
multiple zero $\zeta$ which should have been much smaller than the
condition numbers of the ill conditioned zeros it replaced.

When computing q and $\|q\|$ in cases where we expect the last
Lagrange multiplier to be zero, we usually forced it to be zero while
solving the linear equations for q. We could, however, solve a
system of linear equations of dimension one larger. Then, because of
rounding error, we expect the last Lagrange multiplier to be small but
not zero. So as a check we re-computed q and $\|q\|$ using the
nonzero multiplier. The two values of $\|q\|$ were compared and flagged if
they differed by more than a few units in the last place of precision.

Finally a number of random small perturbations of $\zeta$ were made
and the distance to the nearest polynomial with the perturbed $\zeta$ as a
multiple zero was computed. Since the original $\zeta$ was alleged to be
a minimal point, a message was printed if any of the nearby polynomials
were significantly closer to p.

All the experimental results to be presented in this chapter and
the next satisfied these checks unless otherwise stated. Thus there
is a basis for confidence that the various complicated equations that
were solved for one or more $\zeta$'s were in fact formulated and solved
correctly.

5. Setting Up a Problem with a Known Solution

While  developing  computer  codes  it  is  sometimes  desirable  to 
solve  a problem  whose  answer  is  known.  Although  it  is  not  known  how, 
for  instance,  to  set  up  a polynomial  such  that  the  globally  nearest 
polynomial  with  an  m-tuple  zero  has  the  m-tuple  zero  we  specified  in 
advance,  it  is  a simple  matter  to  set  up  such  a polynomial  so  that  a 
locally  nearest  polynomial  has  that  specified  m-tuple  zero. 

One's first thought might be to start with a trivial problem whose
solution is known and apply a random perturbation. This is done for
some problems described in the next chapter. For instance, a small
random perturbation may be applied to the coefficients of a polynomial
with a double zero to obtain a nearby polynomial with two ill
conditioned zeros. Then the computer codes find that the nearest polynomial
with a double zero has a double zero near the one we started with.
Figure VIII.1 shows why the double zero is not the same; a
perturbation in a random direction is not generally "orthogonal" to the
surface. The change in the multiple zero is usually commensurate with
the size of the perturbation when the multiple zero is well conditioned.

It is possible to set up a perturbation so we return to a
specific multiple zero, however. Recall the equation, (III.6.2), to be
solved for the polynomial nearest p with an m-tuple zero $\zeta$:

    $f(\zeta, p) = \det(\hat A p \ \vdots\ \hat A W^{-1} A^* Z) = 0$.

Z is a constant truncator matrix, W depends only on the norm, and
$A$ and $\hat A$ depend only on $\zeta$.

Normally p is given and we seek $\zeta$ by solving a highly
nonlinear equation. But now we wish to find p given $\zeta$. From the
properties of determinants it is apparent that $f(\zeta, p)$ is a linear
functional of the vector p, so $f(\zeta, p) = u_\zeta^* p$ for some $u_\zeta^*$ which
depends on $\zeta$ but not p. Then to find such a p it is only
necessary to obtain one of the members of the (n-1) dimensional subspace
of solutions of $u_\zeta^* p = 0$.

Figure VIII.1. $\hat p$ has a multiple zero. A random perturbation
produces p. p + q is the polynomial with a multiple zero closest to p.

As an example, suppose we wish to start with a polynomial p
with a double zero at $\alpha$, so $f(\alpha, p) = 0$. We then want to find a q
such that p + q has a locally nearest polynomial with a double zero
at $\alpha$. Presumably that nearest polynomial would be p if q is not
too large.

We find then that $f(\alpha, q) = 0$ is the requirement on q. We can
find such a q by letting $q_0$ be a polynomial with random
coefficients and $q_1$ be the constant polynomial whose value is 1. Then

    $q = q_0 - \dfrac{f(\alpha, q_0)}{f(\alpha, q_1)}\, q_1$

is the polynomial we seek. It may be verified that $f(\alpha, q_1) \ne 0$ for
m = 2 or 3.
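
Because f is linear in its polynomial argument, the construction above is a one-line projection. The sketch below illustrates only that step: the vector `u` is a hypothetical stand-in for the functional $u_\alpha^*$ (the real one comes from the determinant in (III.6.2) and is not computed here), and the function name is an assumption.

```python
import random


def make_admissible(u, n, seed=0):
    """Return coefficients q (highest power first) with u . q = 0,
    built as q0 - (f(q0)/f(q1)) q1 with q0 random and q1 the constant 1."""
    rng = random.Random(seed)
    q0 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    q1 = [0.0] * (n - 1) + [1.0]

    def f(q):                       # stand-in linear functional f(alpha, q)
        return sum(uk * qk for uk, qk in zip(u, q))

    if f(q1) == 0:
        raise ValueError("f(alpha, q1) vanishes; choose a different q1")
    scale = f(q0) / f(q1)
    return [a - scale * b for a, b in zip(q0, q1)]


u = [0.5, -1.0, 2.0]                # hypothetical functional, last entry nonzero
q = make_admissible(u, 3)
print(sum(uk * qk for uk, qk in zip(u, q)))   # close to zero
```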

Then  we  may  apply  the  computer  codes  to  p + q to  verify  that 
they  do  find  a locally  nearest  polynomial  with  an  m-tuple  zero  a. 

We could impose an even more stringent requirement: that the
closest polynomial to $\hat p + \hat q$ with a multiple zero be $\hat p$ itself. This
is just as easy to arrange. Recall the notation from chapter III for
finding the polynomial p + q with a multiple zero $\zeta$ nearest a
polynomial p. For our present purpose $p = \hat p + \hat q$ and $p + q = \hat p$ so
$q = -\hat q$. But

    $q = W^{-1} A^* \ell$

for some m-1 dimensional vector $\ell$ of Lagrange multipliers. So our
recipe is: choose any random m-1 dimensional vector $u$ and let

    $\hat p + \hat q = \hat p - W^{-1} A^* u$

be the perturbed polynomial. Then we may verify that the equation
for $\zeta$,

    $\det(\hat A p \ \vdots\ \hat A W^{-1} A^* Z) = 0$,

is trivially solved when $\zeta = \alpha$, for then

    $\hat A p = -\hat A W^{-1} A^* u$,

where $u$ is understood to be extended by a final zero component.
The matrix whose determinant we seek is just

    $\hat A W^{-1} A^* (-u \ \vdots\ Z)$

and the bottom row of the rightmost factor vanishes as does the
determinant.

When solving for Lagrange multipliers $\ell$,

    $\hat A W^{-1} A^* \ell = -\hat A p = \hat A W^{-1} A^* u$,

and since the rows of $\hat A W^{-1} A^*$ are linearly independent, $\ell = u$ as we
hoped, and $q = -\hat q$. Thus

    $\{\hat q \mid \hat q = -W^{-1} A^* u\}$

is indeed the subspace of perturbations of $\hat p$ for which $\hat p$ is a
locally nearest polynomial with an m-tuple zero.
CHAPTER  IX 


NONPATHOLOGICAL  EXPERIMENTAL  RESULTS 


1. Introduction

We turn now to presentation of some results of calculations
performed on specific polynomials. The results in this chapter generally
tend to vindicate the theory.

Calculations were usually based on the methods described in the
previous chapter. The norms used were weighted least squares norms
intended to minimize relative changes in the coefficients of the
starting polynomial. Thus if the monic starting polynomial of
degree n were

    $p(\tau) = \prod_{i=1}^{n} (\tau - \alpha_i) = \tau^n + \sum_{j=1}^{n} p_j \tau^{n-j}$

then polynomials

    $q(\tau) = \sum_{j=1}^{n} q_j \tau^{n-j}$

would be sought such that p + q had the desired multiplicity
structure and

    $\sum_{j=1}^{n} w_j |q_j|^2$

was minimized. Usually $w_j = 1/|p_j|^2$, but sometimes $w_j = 1/|\check p_j|^2$
was used instead, where

    $\check p(\tau) = \prod_{i=1}^{n} (\tau - |\alpha_i|)$.

The latter norm is applicable when some of the $p_j$'s vanish.

The  choice  of  norm  also  affects  the  condition  numbers.  Generally 
condition  numbers  for  relative  changes  in  the  zeros  are  used. 
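
As an illustration of the usual weighting $w_j = 1/|p_j|^2$, the sketch below forms the coefficients of a monic polynomial from its zeros and measures a perturbation q in the resulting relative norm. The function names are illustrative, and the code simply refuses the case of a vanishing $p_j$, where the text switches to the alternative weights.

```python
import math


def monic_coeffs(zeros):
    """Coefficients [1, p_1, ..., p_n] of prod (t - a) over the zeros a."""
    coeffs = [1.0]
    for a in zeros:
        # multiply the current polynomial by (t - a)
        coeffs = [c - a * d for c, d in zip(coeffs + [0.0], [0.0] + coeffs)]
    return coeffs


def weighted_norm(q, p_tail):
    """sqrt(sum |q_j|^2 / |p_j|^2), the norm of relative coefficient changes."""
    if any(pj == 0 for pj in p_tail):
        raise ValueError("some p_j vanishes; this weighting is not applicable")
    return math.sqrt(sum(abs(qj) ** 2 / abs(pj) ** 2
                         for qj, pj in zip(q, p_tail)))


p = monic_coeffs([-2.0, -1.0, 1.0, 1.0, 2.0, 3.0])
print(p)                                   # [1, -4, -2, 20, -11, -16, 12] as floats
q = [1e-9 * pj for pj in p[1:]]            # every coefficient changed by 1e-9 relatively
print(weighted_norm(q, p[1:]))             # about sqrt(6) * 1e-9
```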

In the first cases the "right answer" is obvious and the codes do
indeed recover that answer.

2. n-tuple Zeros

Equations for finding the nearest polynomial with an n-tuple zero
are given in section III.2. The present example was created by
randomly perturbing a polynomial whose quintuple zero 1 has condition
number .135. A perturbation of norm .749E-12 was applied in a
random direction to create p whose zeros are

    .99557908 ± .32081885E-2 i
    1.00168511 ± .52020041E-2 i
    1.00547160 .

The condition numbers of these zeros vary from .353E+10 to
.357E+10. The equations for finding the nearest polynomial with an
n-tuple zero were solved by Newton's method, starting from the
arithmetic mean of the five zeros of $p(\tau)$. The result was that the
nearest polynomial with an n-tuple zero had the n-tuple zero
1.000000000000007 with condition number .135.

Corresponding  results  were  obtained  for  similar  polynomials  of 
degrees  8 and  20.  Although  the  n-tuple  zero  is  easy  to  find,  the 
nearest  polynomial  with  a real  double  or  triple  zero  is  sometimes 
difficult  to  locate,  especially  if  n is  odd.  There  are  usually 
numerous  nearby  polynomials  with  a complex  double  zero,  and  for  some 
of  these  may  be  found  a nearby  real  polynomial  with  a complex  conju- 
gate pair  of  double  zeros. 
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3.  Returning  to  a Double  Zero 

The  next  polynomial  has  six  zeros  -2,  -1,  1,  1,  2,  and  3.  The 
worst  conditioned  of  these  is  3,  with  condition  number  43.4.  The 
double  zero  at  1 has  condition  number  5.04. 

A random perturbation of norm .438E-8 was applied, creating a
polynomial p:

    Zero                              Condition number

    -2.00000000                       2.89
    -1.00000000                       2.91
      .99999998 ± .10462513E-3 i      .557E+5
     2.00000011                       19.5
     2.99999992                       43.4

The methods of chapter III were applied to find the nearest
polynomial with a double zero, and a polynomial p + q was soon found
whose double zero at .99999998 has condition number 5.04. Here
‖q‖ = .94E-9, and the other zeros were not changed by more than .0000007.
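
The large condition number of the split pair reflects the square-root sensitivity of a double zero. A minimal sketch of that behaviour follows; it is not the experiment reported above. It perturbs only the constant term of the unperturbed polynomial by a hypothetical amount eps, rather than applying a random perturbation in a weighted norm. Near x = 1 the polynomial behaves like 12(x-1)^2, since (1+2)(1+1)(1-2)(1-3) = 12, so the double zero should split into roughly 1 ± i sqrt(eps/12).

```python
import math

# Zeros of the unperturbed polynomial; 1 is a double zero.
ZEROS = [-2.0, -1.0, 1.0, 1.0, 2.0, 3.0]


def poly(x):
    """Evaluate the unperturbed polynomial in factored form."""
    out = 1.0
    for z in ZEROS:
        out *= (x - z)
    return out


def dpoly(x):
    """Derivative of the unperturbed polynomial by the product rule."""
    total = 0.0
    for i in range(len(ZEROS)):
        term = 1.0
        for j, z in enumerate(ZEROS):
            if j != i:
                term *= (x - z)
        total += term
    return total


def split_double_zero(eps, start=complex(1.0, 1e-3), steps=60):
    """Newton's method on poly(x) + eps, started just above the double zero."""
    x = start
    for _ in range(steps):
        x = x - (poly(x) + eps) / dpoly(x)
    return x


for eps in (1e-8, 1e-10, 1e-12):
    z = split_double_zero(eps)
    print(eps, z, math.sqrt(eps / 12.0))
```

Reducing eps by a factor of 100 reduces the imaginary part only by a factor of about 10, which is why a perturbation of order 1E-8 can move the zeros by about 1E-4.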

Other locally minimal polynomials with double zeros were also
found. For instance the next closest one has a double zero at 2.5397
with condition number 3.85, and the worst conditioned zeros of p + q
are .952 ± .158i, with condition numbers 28.6. But ‖q‖ = .385E-2,
so this perturbation is over a million times larger than the previous
one. By taking such a large step we manage to decrease the worst
condition number only by a factor of 2, and this perturbation seems much
less natural than the previous one.

Similarly when we seek the nearest polynomial with a triple zero,
we find we must let ‖q‖ = .017 in order to reach the polynomial with
a triple zero at 1.20. The worst conditioned zero of that polynomial
has condition number 8.64.

Thus we find that by forcing a large enough perturbation on p
we can make its zeros as well conditioned as we want. However in this
case we find that there is an "obvious" perturbation in which a
comparatively small change in p results in a comparatively large
improvement in the worst condition of p's zeros.

4. Returning to a Triple Zero

We start with the polynomial with simple zeros -2, -1, and 3, and
triple zero 1. The condition of the triple zero is .797 and the worst
zero is 3, with condition number 5.52.

Apply a random perturbation of norm .839E-10 to find p, a
polynomial whose zeros and condition numbers are

    -2.00000000                       1.21
    -1.00000000                       .615
      .99980426 ± .33876727E-3 i      .357E+7
     1.00039148                       .357E+7
     2.99999999                       5.52


When we search for nearby polynomials with double zeros, we find
for instance one with a double zero .99999525 at distance .365E-10.
The condition of that double zero is somewhat improved to .714E+5 but
the condition of the third zero near 1 becomes .807E+10. Even though
we can reach a double zero in a small step, the results are not
interesting.

When we search for a nearby triple zero, however, we find that a
perturbation of norm .495E-10 gets us to a polynomial with a triple
zero 1.000000000014 with condition number .797. The worst zero has
condition number 5.52. Comparing to the perturbation to a double
zero, we find that a not much larger perturbation to a higher
multiplicity structure yields a substantial improvement in condition.
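
The three ill conditioned zeros of p already hint at the triple zero: a slightly perturbed triple zero splits into three zeros spaced about 120 degrees apart on a small circle. The sketch below checks this on the values tabulated above. It is an illustration only, and the names `cluster`, `radii`, and `gaps` are not from the codes described in this work.

```python
import cmath
import math

# The three ill conditioned zeros of p near 1, copied from the table above.
cluster = [
    complex(0.99980426, 0.33876727e-3),
    complex(0.99980426, -0.33876727e-3),
    complex(1.00039148, 0.0),
]

# Centre of the cluster and the distance of each zero from it.
center = sum(cluster) / len(cluster)
radii = [abs(z - center) for z in cluster]

# Angles of the zeros as seen from the centre, in degrees, sorted,
# and the angular gaps between neighbours around the circle.
angles = sorted(math.degrees(cmath.phase(z - center)) % 360.0 for z in cluster)
gaps = [(angles[(i + 1) % 3] - angles[i]) % 360.0 for i in range(3)]

print("center:", center)
print("radii:", radii)
print("angular gaps (degrees):", gaps)
```

The centre lands very close to 1, which is consistent with the triple zero 1.000000000014 found by the codes.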

Computer codes for quadruple zeros are not available but it seems
doubtful that this p could be perturbed to a quadruple zero by a
further small perturbation.

5. Returning to Two Double Zeros

The polynomial with simple zeros -2, 0, and 2, and double zeros
at -1 and +1 was perturbed by a random perturbation of norm .332E-7 to
produce a polynomial whose ill conditioned zeros were

    … ± .562407E-4 i    with condition numbers …

A polynomial at distance .143E-7 had a double zero but two
remaining ill conditioned zeros. There was a polynomial with a triple
zero at distance .629 with all zeros well conditioned. But the
satisfactory polynomial had two double zeros at ± .999999997. All zeros
were well conditioned but the perturbation q was only .219E-7 in
norm.

6. Returning to a Complex Conjugate Pair of Double Zeros

Consider the eighth degree polynomial whose simple zeros are -3,
-2, -1, and 4, and which also has double zeros at 2±i. The worst
zero is 4, with condition number 55.0; the condition of the complex
zeros is 7.98.

A random perturbation of norm .168E-8 produces a polynomial p
whose zeros and condition numbers are

    -3.00000000                       9.98
    -2.00000000                       14.9
    - .99999999                       6.12
     1.99982354 ± 1.00012355 i        .126E+6
     2.00017652 ±  .99987637 i        .126E+6
     3.99999984                       55.0


When we apply the methods of chapter IV we discover that there is
a real polynomial p + q with double zeros at 2.000000012 ± .9999999946 i
with condition numbers 7.98. ‖q‖ is .459E-9 and the worst zero is
3.9999993 with condition number 55.0.

Thus in the case of a complex conjugate double zero we can also
find the answer when it is obvious. In this case no real double or
triple zeros were found closer than .001. Of course there is no
theoretical basis for asserting that they do not exist; but if they do,
they must be rather well hidden!


7.  A Polynomial  with  Several  Pairs  of  Complex  Conjugate  Zeros 

Wilkinson presents a real polynomial [34, p. 63] all of whose 16
zeros are complex, most being rather ill conditioned. Condition
numbers range from .878 to .107E+11.

No real ζ's were found other than 0, but 7 complex ζ's
corresponding to complex perturbations were found. All of these complex ζ's
lead to nearby real polynomials with complex conjugate pairs of double
zeros. The closest of these is at a distance of .247E-13 and the
worst conditioned zero of the perturbed polynomial has a condition
number of .551E+10. So from the point of view of "explanation,"
clearly some higher multiplicity configuration is required. The value
of this example is rather that it shows that the codes are capable of
finding a number of complex conjugate pairs of double zeros when the
problem is of a nature that several such solutions might reasonably
be expected.

In the table below we list the unperturbed zeros a and their
condition numbers on the left and, on the right, ‖q‖, ζ, the
condition of ζ, and the worst condition number of the perturbed polynomial.

    Unperturbed zero a                          Double zero ζ
    Real      Imag    Cond(a)      ‖q‖          Real       Imag    Cond(ζ)    worst

    .305E-5   .312    .565E+10     .247E-13     -.884E-5   .312    .110E+8    .551E+10
    .148E-4   .312    .107E+11     .545E-13     -.354E-4   .311    .199E+8    .277E+10
    .471E-4   .311    .646E+10     .329E-12     -.116E-3   .309    .108E+8    .137E+10
    .143E-3   .309    .154E+10     .644E-11     -.417E-3   .306    .196E+7    .168E+9
    .491E-3   .304    .127E+9      .656E-9      -.201E-2   .295    .852E+5    .382E+7
    .232E-2   .293    .233E+7      .111E-5      -.166E-1   .260    .381E+3    .710E+4
    .187E-1   .253    .297E+4      .134E-1      -.121      .162    .483       .222E+2
    .132      .136    .878         .141E+1      0          0       .036       .227E+2

8. An Uninteresting Polynomial

In  contrast  to  the  previous  examples,  we  consider  now  a polynomial 
all  of  whose  zeros  are  well  conditioned,  just  to  see  how  the  manifold 
of  double  zeros  appears  from  a distance. 

Let p be a cubic polynomial with zeros 1, 2, and 3, and
condition numbers .87, 4.6, and 4.8. For this example we use the uniform
norm for which all weights are 1. After a lengthy search we find the
following interesting points:

    ζ                          ‖q‖       Worst condition

    Double at  2.49244540      .0551     .72
    Double at  1.32286845      .152      2.7
    Double at  0.0             12.53     1.0
    Double at -3.20829919      12.57     .15
    Double at -1.13700604      13.93     10.4
    Triple at  1.87492441      57.18     .99E-2

Of  these  points,  0.0  turned  out  not  to  be  a stationary  point,  and 
-1.13...  turned  out  to  be  a maximum  on  the  real  axis,  and  a saddle 
point  in  the  complex  plane.  The  point  1.87...  represents  a minimum 
among  perturbations  to  a real  zero  but  a maximum  among  real  perturba- 
tions to  a double  zero.  The  other  three  points  are  local  minima  in 
the  complex  plane. 

This example supports the conclusion in chapter II that absence
of ill condition implies distance from the manifolds of polynomials
with multiple zeros.

9. Zeros in a Circle

The next example is a polynomial mentioned by Wilkinson [34].
Its zeros lie around the unit circle and are the twenty 20th roots of
unity. In the uniform norm the zeros are all very well conditioned;
the real zeros have condition numbers .224 and the complex zeros have
slightly smaller condition numbers, since only real perturbations are
considered. Our codes were unable to find any solutions for double
zeros other than zero or for complex conjugate pairs except by great
labor, which produced unsatisfactory results. It turns out that

    \[ p(\tau) = \tau^n - \beta , \]

β real and positive, has non-zero solutions ζ constrained as follows
for double zeros:

    \[ \beta / n < |\zeta|^n < (n-1)\beta , \]

    \[ \arg \zeta = (2k+1)\pi / n , \qquad k = 0, 1, \ldots, n-1 . \]

Thus arg(ζ^n) = π and if n is even there are no real solutions ζ.

10. Summary

The  results  presented  in  this  chapter  and  other  similar  results 
lead  to  the  following  conclusions: 

1.  When  there  is  an  "obvious"  nearby  polynomial  of  a certain 
multiplicity  structure,  the  computer  codes  find  it.  If  insufficient 
multiplicity  is  requested,  the  codes  find  a polynomial  that  is  close 
but  has  some  zeros  still  very  ill  conditioned.  When  too  much  multi- 
plicity is  requested,  the  codes  find  a polynomial  that  is  relatively 
far  away  although  all  its  zeros  are  well  conditioned.  When  the  proper 
multiplicity  is  specified,  the  codes  find  a polynomial  which  is  rela- 
tively close  and  has  all  zeros  well  conditioned. 

2.  When  there  is  no  obvious  reason  why  a nearby  polynomial  would 
have  substantially  better  conditioned  zeros,  the  codes  do  not  find  any 
such  polynomials. 

3. The polynomials that the codes find are indeed critical
points for ‖q‖ and are usually minima. In other words, the answers
are correct, but the codes may not be able to find all the answers.

With  conclusions  like  these,  based  on  simple  cases,  we  have  some 
basis  for  confidence  in  examining  a more  difficult  polynomial  in  the 
next  chapter. 


CHAPTER  X 


WHAT'S  WRONG  WITH  WILKINSON'S  POLYNOMIAL? 

1. Wilkinson's Polynomial

In [34] J. Wilkinson describes the astonishing ill condition of a
polynomial whose zeros are the integers from 1 through 20. He observed
that by changing one of the coefficients by less than one part in
1.0E+15 it was possible to create a polynomial some of whose zeros
were complex conjugate pairs.

Our results in chapter II lead us to conclude that this badly
behaved polynomial must be near the manifold of polynomials with double
zeros, at least, and perhaps near manifolds corresponding to higher
multiplicity configurations as well. Since this polynomial is
precisely defined, we are not interested in "ameliorating" its ill
condition but rather "explaining" that ill condition if possible. The
results mentioned in the previous chapter show that ill condition is
ideally explained by displaying a small perturbation to a nearby
manifold of polynomials with some appropriate multiplicity configuration.
We shall see that the experimental results presently available do not
support any such simple explanation for Wilkinson's polynomial; rather
they suggest that it is near a place where the manifolds of polynomials
with multiple zeros are especially contorted.

After examining the well known Wilkinson polynomial we will look
briefly at its translation to the origin and at another Wilkinson
polynomial which is in some ways the opposite of the first.




2.  Coefficients  and  Condition  Numbers  for  Wilkinson's  Polynomial 

Two  unusual  things  about  Wilkinson's  polynomial  are  the  ranges  in 
magnitude  among  the  coefficients  and  among  the  condition  numbers  of 
the  zeros. 

The zeros are the integers from 1 through 20. Therefore the
coefficients are exactly computable, but as a practical matter most
have so many significant figures that they must be rounded to fit in
48 bits of a CDC computer word. Consequently the polynomial should be
considered to be defined by its zeros, and the coefficients are only
used to compute the weights in the norm on perturbations:

    \[ p(\tau) = \prod_{i=1}^{n} (\tau - i) = \tau^n + \sum_{j=1}^{n} p_j \tau^{n-j} , \qquad n = 20 \]

    \[ \|q\|^2 = \sum_{j=1}^{n} w_j |q_j|^2 \]

    \[ w_j = 1 / |p_j|^2 \]

This  "relative"  norm  measures  relative  changes  in  the  coefficients  of 
p;  we  will  also  present  results  for  the  "uniform"  norm  in  which  all 
the  weights  are  1 and  which  measures  absolute  changes  in  the  coeffi- 
cients of  p. 

Some  differences  between  these  norms  might  be  expected  due  to  the 
large  variation  in  those  coefficients.  In  magnitude  they  range  from 
210  to  1E19;  they  are  listed  in  Table  X.2.  Thus  the  corresponding 
weights  for  the  relative  norm  range  from  1E4  to  1E38. 
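
Because the zeros are integers, the coefficients can be generated exactly in integer arithmetic. The sketch below does so and reports the range of their magnitudes. The function name `wilkinson_coefficients` is an illustrative choice and does not refer to any code described in this work.

```python
def wilkinson_coefficients(n=20):
    """Exact integer coefficients of prod_{i=1..n} (tau - i).

    The returned list c satisfies p(tau) = sum_j c[j] * tau**(n - j),
    so c[0] == 1 and c[j] is the coefficient called p_j in the text.
    """
    coeffs = [1]
    for i in range(1, n + 1):
        # Multiply the current polynomial by (tau - i):
        # appending 0 multiplies by tau, then subtract i times the old one.
        shifted = coeffs + [0]
        for k, c in enumerate(coeffs):
            shifted[k + 1] -= i * c
        coeffs = shifted
    return coeffs


p = wilkinson_coefficients(20)
magnitudes = [abs(c) for c in p[1:]]

print("p_1 =", p[1])
print("p_20 =", p[20])
print("smallest |p_j| =", min(magnitudes))
print("largest  |p_j| = %.4e" % max(magnitudes))
```

Python integers are unbounded, so no rounding occurs here; rounding enters only when the coefficients are stored in a fixed-length floating-point word.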

The zeros are given in Table X.1 with their condition numbers.
The first condition number is with respect to the uniform norm on the
polynomial. The second condition number is with respect to the
relative norm on the polynomial. All condition numbers are for absolute
changes in the zeros. The condition number for relative changes in a,
say, may be obtained by dividing the listed condition number by |a|.


Table X.1

Zeros of Wilkinson's Polynomial and Their Condition Numbers

    Zero    Uniform Norm    Relative Norm

      1     .368E-16        .187E+3
      2     .946E-10        .355E+5
      3     .173E-5         .234E+7
      4     .226E-2         .778E+8
      5     .620            .153E+10
      6     .591E+2         .198E+11
      7     .257E+4         .177E+12
      8     .602E+5         .115E+13
      9     .845E+6         .553E+13
     10     .763E+7         .203E+14
     11     .466E+8         .572E+14
     12     .199E+9         .125E+15
     13     .607E+9         .212E+15
     14     .134E+10        .278E+15
     15     .212E+10        .279E+15
     16     .241E+10        .210E+15
     17     .191E+10        .114E+15
     18     .997E+9         .428E+14
     19     .309E+9         .979E+13
     20     .432E+8         .103E+13
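
The condition numbers for absolute changes can be recomputed from the zeros alone. The sketch below assumes the usual first-order sensitivity formula for a simple zero a, namely cond(a) = sqrt( sum over j = 1..n of |a|^(2(n-j)) / w_j ) / |p'(a)|, with w_j = 1 for the uniform norm and w_j = 1/|p_j|^2 for the relative norm. That formula is an assumption made here for illustration and is not quoted from chapter II. With it, the uniform value for the zero 1 is sqrt(20)/19!, about .368E-16.

```python
import math


def wilkinson_coefficients(n=20):
    """Exact coefficients c[j] of tau**(n - j) in prod_{i=1..n} (tau - i)."""
    coeffs = [1]
    for i in range(1, n + 1):
        shifted = coeffs + [0]
        for k, c in enumerate(coeffs):
            shifted[k + 1] -= i * c
        coeffs = shifted
    return coeffs


def condition_numbers(n=20, relative=False):
    """Assumed first-order condition numbers of the zeros 1..n."""
    p = wilkinson_coefficients(n)
    result = []
    for a in range(1, n + 1):
        # |p'(a)| is the product of the distances to the other zeros.
        dp = 1
        for i in range(1, n + 1):
            if i != a:
                dp *= abs(a - i)
        # Sum over the perturbable coefficients j = 1..n of a^(2(n-j)) / w_j,
        # accumulated exactly in integers before taking the square root.
        total = 0
        for j in range(1, n + 1):
            inv_w = p[j] ** 2 if relative else 1
            total += inv_w * a ** (2 * (n - j))
        result.append(math.sqrt(total) / dp)
    return result


uniform = condition_numbers(relative=False)
relative = condition_numbers(relative=True)
for a in range(1, 21):
    print("%2d   %.3e   %.3e" % (a, uniform[a - 1], relative[a - 1]))
```

Dividing each printed value by a gives the corresponding condition number for relative changes in the zero.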

The  most  striking  facts  about  the  condition  numbers  are 

1)  the  magnitude  of  the  ill  condition  of  the  worst, 

2)  the  large  group  of  zeros  that  are  nearly  as  badly  condi- 
tioned as  the  worst,  and 

3)  the  lack  of  any  obvious  partitioning  into  a set  of  well  con- 
ditioned zeros  and  a set  of  ill  conditioned  ones. 

The  last  fact  distinguishes  this  polynomial  from  those  of  the 
previous  chapter.  There  is  no  obviously  best  multiplicity  configura- 
tion that  we  should  look  for.  So  we  will  try  as  many  as  we  can, 
starting  from  the  simplest. 

Before  giving  the  results,  it  is  instructive  to  attempt  to  graph 
this  polynomial.  It  turns  out  to  be  impossible  to  perceive  all  its 
features  on  one  graph,  so  we  present  several  successive  magnifications 
of  interesting  parts.  Figures  X.1-X.4  were  produced  on  a Tektronix  4051 
Graphics  System. 

It is interesting to note that the symmetry of the polynomial
about 10.5 is not reflected in the condition numbers, which reach their
maximum near 15, depending on the norm. That is because the formula
in chapter II for the condition number of a zero a has a numerator
which is a monotonic increasing function of |a| and a denominator
that depends only on the absolute spacing between a and the other
zeros. The numerator is a rather rapidly increasing function of |a|;
for a simple zero, it is

Intuitively it is hard to understand why the larger zeros should
be so much more ill conditioned than the smaller ones. Indeed, by
translating the entire polynomial by -10.5 so that it is symmetric
about the origin, one can eliminate that part of the anomaly. Wilkinson
did so and found substantial overall improvement in the condition of
the zeros. Of course, if that translation were regarded as a
perturbation, its norm would exceed 1 in the relative norm and 1E19 in the
uniform norm, and we know that remarkable improvements in condition
often accompany large perturbations.

We wish, however, to study Wilkinson's polynomial as an
untranslated object. The next section gives our results. Some results for
the translated polynomial are in a later section.

Figure X.3. Wilkinson's polynomial on [3,1

3. The Nearest Polynomial with a Double Zero

There  are  many  polynomials  with  a double  zero  that  are  close  to 
Wilkinson's  polynomial.  In  the  next  section  we  will  list  some  of  them. 
In  the  present  section  we  will  just  present  the  facts  about  the 
closest  known  such  polynomials  in  each  norm. 

In the relative norm the nearest polynomial on the manifold has
a double zero at 14.499... . The distance ‖q‖ is .11054E-14. The
double zero and some of the nearby simple zeros are listed along with
their condition numbers and their condition numbers prior to
perturbation.

    Unperturbed zero and condition      Perturbed zero and condition

    12      .125E+15                    12.15289      .174E+15
    13      .212E+15                    12.77240      .225E+15
    14      .278E+15                    14.49963      .963E+13
    15      .279E+15                    14.49963      .963E+13
    16      .210E+15                    16.22347      .215E+15
    17      .114E+15                    16.85795      .159E+15

The coefficients of q are in Table X.2.

The corresponding distance in the uniform norm is .13481E-9.

    Unperturbed zero and condition      Perturbed zero and condition

    13      .607E+9                     13.09030      .753E+9
    14      .134E+10                    13.83087      .123E+10
    15      .212E+10                    15.48653      .325E+7
    16      .241E+10                    15.48653      .325E+7
    17      .191E+10                    17.25351      .205E+10
    18      .997E+9                     17.83934      .152E+10

In both cases we find that moving to the manifold of double zeros
improved the condition of the coalescing zeros appreciably, and thereby
improved the overall condition of the polynomial. But some of the
nearby zeros actually worsened slightly in condition. Evidently
moving to an even higher manifold is in order.
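
The two distances quoted in this section can be checked against the perturbation coefficients in Table X.2. The sketch below copies those coefficients as five-figure constants and evaluates the uniform norm sqrt(sum q_j^2) and the relative norm sqrt(sum (q_j/p_j)^2), with the p_j generated exactly. The list and function names are illustrative only. The results should come out close to .13481E-9 and .11054E-14, up to the rounding of the tabulated figures.

```python
import math


def wilkinson_coefficients(n=20):
    """Exact coefficients c[j] of tau**(n - j) in prod_{i=1..n} (tau - i)."""
    coeffs = [1]
    for i in range(1, n + 1):
        shifted = coeffs + [0]
        for k, c in enumerate(coeffs):
            shifted[k + 1] -= i * c
        coeffs = shifted
    return coeffs


# q_j, j = 1..20, for the nearest double zero in the uniform norm (Table X.2).
Q_UNIFORM = [
    .13452e-9, .86866e-11, .56091e-12, .36219e-13, .23388e-14,
    .15102e-15, .97516e-17, .62969e-18, .40660e-19, .26255e-20,
    .16954e-21, .10947e-22, .70689e-24, .45646e-25, .29474e-26,
    .19032e-27, .12290e-28, .79357e-30, .51242e-31, .33088e-32,
]

# q_j, j = 1..20, for the nearest double zero in the relative norm (Table X.2).
Q_RELATIVE = [
    -.29637e-15, -.19697e-12, -.50496e-10, -.62696e-8, -.42520e-6,
    -.16922e-4, -.41346e-3, -.63804e-2, -.63237e-1, -.40560,
    -1.68308, -4.48311, -7.54344, -7.81487, -4.79737,
    -1.64939, -.29168, -.23138e-1, -.64163e-3, -.34188e-5,
]

p = wilkinson_coefficients(20)

# Uniform norm: all weights are 1.
uniform_norm = math.sqrt(sum(q * q for q in Q_UNIFORM))

# Relative norm: each q_j is measured relative to p_j.
relative_norm = math.sqrt(
    sum((q / p[j + 1]) ** 2 for j, q in enumerate(Q_RELATIVE))
)

print("uniform norm of q:  %.5e" % uniform_norm)
print("relative norm of q: %.5e" % relative_norm)
```

In the uniform case successive q_j shrink by a nearly constant factor of about 15.49, which is the location of the double zero in that norm.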


Table X.2

Coefficients of Wilkinson's Polynomial and of the Perturbations
to the Nearest Polynomial with a Double Zero

     j    p_j                  q_j, uniform norm    q_j, relative norm

     1    -210                 .13452E-9            -.29637E-15
     2    20615                .86866E-11           -.19697E-12
     3    -1256850             .56091E-12           -.50496E-10
     4    .5332794600E+8       .36219E-13           -.62696E-8
     5    -.1672280820E+10     .23388E-14           -.42520E-6
     6    .4017177163E+11      .15102E-15           -.16922E-4
     7    -.7561111845E+12     .97516E-17           -.41346E-3
     8    .1131027700E+14      .62969E-18           -.63804E-2
     9    -.1355851829E+15     .40660E-19           -.63237E-1
    10    .1307535010E+16      .26255E-20           -.40560
    11    -.1014229987E+17     .16954E-21           -1.68308
    12    .6303081210E+17      .10947E-22           -4.48311
    13    -.3113336432E+18     .70689E-24           -7.54344
    14    .1206647804E+19      .45646E-25           -7.81487
    15    -.3599979518E+19     .29474E-26           -4.79737
    16    .8037811823E+19      .19032E-27           -1.64939
    17    -.1287093125E+20     .12290E-28           -.29168
    18    .1380375975E+20      .79357E-30           -.23138E-1
    19    -.8752948037E+19     .51242E-31           -.64163E-3
    20    .2432902008E+19      .33088E-32           -.34188E-5

4. Interesting Polynomials near Wilkinson's

Tables X.3 and X.4 list a number of interesting polynomials near
Wilkinson's which have one or more multiple zeros. The first columns
list the distance to the polynomial from Wilkinson's, ‖q‖, and the
multiple zeros ζ. All multiple zeros are double except those marked
(3) which are triple. In the last columns are listed the worst
condition number of a multiple zero and the worst condition number among
the simple zeros.

Table  X.3  is  based  on  relative  changes  in  the  coefficients  of 
Wilkinson's  polynomial.  Table  X.4  is  based  on  the  uniform  norm  in 
which  all  the  weights  are  1.  Some  of  the  entries  are  incomplete;  to 
conserve  paper  some  of  the  computer  codes  involved  did  not  print  all 
details  for  some  of  the  less  interesting  polynomials. 

All the polynomials listed represent solutions of equations
presented in chapters III-V. Most of the solutions are local minima.
The likely candidates for global minima in each category are indicated
by *.

There are apparently a very large number of solutions for the
cases of 2, 3, or 4 double zeros. To keep computing expenses in bounds
it was necessary to discontinue the computation after a certain
arbitrary number, usually 20, of these solutions had been found. Even
these are not all listed in the tables; some were omitted whose norms
are larger than those listed.


Table X.3

Interesting Polynomials Near Wilkinson's, Relative Norm

                                                  Worst condition numbers
     ‖q‖        ζ's                               Multiple zero  Simple zero

     unperturbed polynomial                                  .279E+15
* 1. .110E-14   14.4996                           .963E+13   .225E+15
  2. .127E-14   15.5295                           .771E+13   .122E+16
  3. .127E-14   13.472                            .895E+13   .124E+16
* 4. .128E-14   13.471, 15.531                    .895E+13   .113E+15
  5. .192E-14   12.446                            .631E+13   .349E+15
  6. .201E-14   16.562                            .444E+13   .573E+15
  7. .202E-14   12.442, 16.563                    .629E+13   .196E+15
  8. .376E-14   11.420                            .340E+13   .111E+15
  9. .392E-14   12.467, 14.535                    .963E+13   .188E+15
 10. .454E-14   17.600                            .173E+13   .449E+1?
 11. .454E-14   11.413, 17.600                    .337E+13   .825E+14
 12. .485E-14   14.454, 16.543                    .977E+13   .185E+15
 13. .615E-14   11.436, 15.573                    .751E+13   .156E+15
 14. .889E-14   13.397, 17.587                    .886E+13   .120E+15
 15. .956E-14   10.396                            .141E+13   .459E+14
 16. .101E-13   12.509, 15.451                    .841E+13   .218E+15
 17. .104E-13   11.454, 13.565                    .965E+13   .301E+15
 18. .110E-13   11.454, 16.516                    .479E+13   .711E+15
 19. .112E-13   13.573, 16.513                    .980E+13   .120E+16
*20. .112E-13   11.451, 13.572, 16.513            .980E+13   .224E+15
 21. .131E-13   10.407, 16.612                    .417E+13   .816E+14
 22. .133E-13   12.527, 17.579                    .706E+13   .200E+15
 23. .144E-13   11.468, 14.361                    .108E+14   .201E+15
 24. .162E-13   18.646                            .410E+12   .534E+14
 25. .163E-13   10.381, 18.645                    .137E+13   .274E+14
 26. .176E-13   15.383, 17.570                    .944E+13   .240E+15
 27. .182E-13   10.417, 14.668                    .106E+14   .214E+15
 28. .197E-13   10.419, 15.378                    .968E+13   .307E+15
 29. .197E-13   10.417, 17.571                    .186E+13   .874E+15
 30. .197E-13   10.418, 15.373, 17.572            .978E+13   .250E+15
 31. .210E-13   14.708, 17.564                    .114E+14   .360E+15
 32. .210E-13   10.412, 14.708, 17.564            .114E+14   .250E+15
 33. .261E-13   12.475, 15.407, 17.565            .881E+13   .231E+15
 34. .264E-13   12.300, 18.638                    .597E+13   .619E+14
 35. .272E-13   11.487, 14.417, 16.559            .104E+14   .178E+15
 36. .298E-13   12.496, 14.492, 16.525            .926E+13   .532E+14
 37. .317E-13   9.372                             .447E+12   .164E+14
*38. .341E-13   13.978(3)                         .482E+12   .431E+14
 39. .343E-13   11.474, 13.514, 15.494            .876E+13   .114E+15
 40. .367E-13   15.038(3)                         .413E+12   .466E+14
 41. .423E-13   12.921(3)                         .414E+12   .344E+14
 42. .547E-13   16.105(3)                         .253E+12   .199E+15
 43. .696E-13   11.868(3)                         .268E+12   .195E+14
*44. .110E-12   11.458, 13.531, 15.466, 17.549    .886E+13   .170E+14
 45. .113E-12   19.710                            .440E+11   .555E+13
 46. .118E-12   17.181(3)                         .104E+12   .136E+14
 47. .130E-12   10.448, 12.557, 14.396, 18.617    .881E+13   .567E+14
 48. .136E-12   10.464, 12.526, 14.472, 16.543    .934E+13   .358E+14
 49. .138E-12   8.349                             .107E+12   .465E+13
 50. .145E-12   9.417, 13.584, 15.444, 17.560     .857E+13   .405E+14
 51. .150E-12   10.816(3)                         .131E+12   .818E+13
 52. .175E-12   10.462, 12.544, 15.440, 17.554    .822E+13   .225E+15
 53. .299E-12   10.477, 12.501, 14.542, 17.519    .102E+14   .164E+15
 54. .321E-12   9.4?3, 12.309, 14.708, 15.777     .123E+14   .270E+15
 55. .409E-12   18.273(3)                         .257E+11   .438E+13
 56. .426E-12   9.766(3)                          .485E+11   .268E+13
 57. .808E-12   7.325                             .191E+11   .115E+13
 58. .160E-11   8.717(3)                          .135E+11   .652E+12
 59. .285E-11   19.401(3)                         .283E+10   .719E+12
 60. .652E-11   6.302                             .247E+10   .230E+12
 61. .808E-11   7.669(3)                          .281E+9    .116E+12
 62. .561E-10   6.620(3)                          .424E+9    .200E+11
 63. .751E-10   5.277                             .224E+9    .384E+11
 64. .555E-9    5.570(3)                          .450E+8    .362E+10
 65. .131E-8    4.252                             .135E+8    .633E+10
 66. .822E-8    4.519(3)                          .320E+7    .609E+9
 67. .381E-7    3.226                             .493E+6    .975E+9
 68. .198E-6    3.464(3)                          .142E+6    .982E+8
 69. .213E-5    2.196                             .950E+4    .162E+9
 70. .888E-5    2.404(3)                          .345E+4    .157E+8
 71. .320E-3    1.160                             .720E+2    .389E+8
 72. .976E-3    1.331(3)                          .366E+2    .426E+7
 73. 1.414      0.0                               .615E+1    .165E+11
 74. 1.732      0.0(3)                            .286E+1    .120E+10
 75. 2.19614    -117.314
 76. 2.73772    -9.579


Table X.4

Interesting Polynomials Near Wilkinson's, Uniform Norm

                                                  Worst condition numbers
     ‖q‖        ζ's                               Multiple zero  Simple zero

     unperturbed polynomial                                  .241E+10
* 1. .135E-9    15.487                            .325E+7    .152E+10
  2. .142E-9    16.524                            .271E+7    .295E+10
  3. .183E-9    14.452                            .268E+7    .281E+10
  4. .223E-9    17.567                            .148E+7    .186E+10
  5. .350E-9    13.419                            .156E+7    .120E+10
  6. .570E-9    18.613                            .471E+6    .624E+9
  7. .936E-9    12.388                            .645E+6    .417E+9
  8. .294E-8    19.691                            .664E+5    .126E+9
  9. .352E-8    11.358                            .189E+6    .106E+9
*10. .477E-8    14.465, 16.537                    .271E+7    .856E+9
 11. .723E-8    13.431, 17.578                    .159E+7    .137E+10
 12. .114E-7    15.449, 17.550                    .327E+7    .117E+10
 13. .135E-7    12.397, 18.625
 14. .159E-7    11.361, 19.692                    .190E+6    .121E+9
 15. .190E-7    10.329                            .384E+5    .219E+8
 16. .215E-7    13.454, 15.557                    .333E+7    .165E+10
 17. .247E-7    14.387, 18.607                    .261E+7    .911E+9
 18. .338E-7    14.547, 17.515                    .299E+7    .186E+10
 19. .359E-7    13.478, 16.422                    .306E+7    .177E+10
 20. .389E-7    12.414, ?.?28                     .262E+7    .955E+9
 21. .446E-7    13.493, 18.597                    .177E+7    .1?9E+10
 22. .489E-7    12.421, 17.490                    .172E+7    .167E+10
 23. .602E-7    16.341, 18.589
 24. .633E-7    12.431, 14.643
*25. .105E-6    16.025(3)                         .835E+4    .440E+9
 26. .10?E-6    14.948(3)                         .944E+4    .342E+9
 27. .151E-6    9.300                             .529E+4    .319E+7
 28. .156E-6    13.877(3)                         .724E+4    .212E+9
 29. .159E-6    17.112(3)                         .477E+4    .214E+9
 30. .326E-6    12.811(3)                         .385E+4    .901E+8
 31. .398E-6    18.216(3)                         .158E+4    .445E+8
 32. .972E-6    11.747(3)                         .143E+4    .262E+8
*33. .120E-5    13.451, 16.372, 18.585            .326E+7    .198E+10
 34. .127E-5    12.442, 14.592, 17.539            .312E+7    .197E+10
 35. .186E-5    8.271
 36. .192E-5    13.485, 15.496, 17.528            .309E+7    .245E+9
 37. .206E-5    19.358(3)                         .226E+3    .251E+8
 38. .208E-5    11.396, 15.820, 18.596            .520E+7    .418E+10
 39. .221E-5    13.416, 15.638, 18.574            .385E+7    .193E+10
 40. .221E-5    11.397, 16.273, 18.598            .481E+7    .376E+10
 41. .239E-5    12.461, 15.384, 17.579            .364E+7    .128E+10
 42. .301E-5    12.457, 14.533, 16.469            .270E+7    .107E+10
 43. .337E-5    12.460, 14.522, 18.581
 44. .379E-5    12.468, 16.428, 18.575
 45. .417E-5    10.685(3)                         .368E+3    .513E+7
 46. .440E-5    14.563, 16.439, 18.573
 47. .496E-5    12.492, 15.530, 18.558
 48. .502E-5    11.411, 15.564, 17.492
 49. .531E-5    11.420, 14.116, 18.648
 50. .746E-5    11.428, 14.255, 16.742
 51. .768E-5    11.429, 14.269, 17.335
 52. .896E-5    11.431, 13.597, 15.311
 53. .948E-5    13.605, 15.299, 19.668
 54. .265E-4    9.625(3)                          .637E+2    .657E+6
 55. .309E-4    16.019, 17.181, 19.646
 56. .378E-4    7.242                             .255E+2    .252E+5
*57. .520E-4    12.447, 14.547, 16.447, 18.570    .274E+7    .382E+8
 58. .252E-3    11.446, 13.567, 16.444, 18.564
 59. .259E-3    8.565(3)                          .718E+1    .526E+5
 60. .327E-3    11.455, 13.528, 15.512, 18.537
 61. .340E-3    11.468, 14.434, 16.495, 18.556
 62. .728E-3    12.580, 14.355, 17.039, 19.645
 63. .141E-2    6.213                             .757       .132E+4
 64. .416E-2    7.504(3)                          .495       .361E+4
 65. .113       5.183                             .106E-1    .397E+2
 66. .120       6.444(3)                          .191E-1    .198E+3
 67. 7.24       5.382(3)                          .355E-3    .970E+1
 68. 25.3       4.152                             .539E-4    .193E+1
 69. 1161.4     4.318(3)                          .254E-5    .474E+0
 70. .256E+5    3.120
 71. .770E+6    3.251(3)
 72. .331E+9    2.085
 73. .525E+10   2.181(3)
 74. .679E+15   1.062
 75. .144E+16   1.140(3)



5.  Discussion  of  Results 

Apparently Wilkinson's polynomial lies near a thicket of
intersecting branches of the manifold of polynomials with a double zero;
see Figure X.5. Although there is a unique point on this manifold
closest  to  p,  there  are  other  locally  closest  points  in  different 
directions  that  are  not  much  further  away.  In  turn  the  self-inter- 
sections of  the  manifold,  which  form  the  manifold  of  polynomials  with 
two  double  zeros,  may  be  found  not  much  further  from  p than  the 
first  manifold.  And  by  steps  that  are  increasingly  larger,  but  not 
overwhelmingly  so,  it  is  possible  to  obtain  3 or  4 double  zeros  or  a 
triple  zero. 

Perhaps  the  polynomials  whose  zeros  are  the  integers  from  1 to  n 
form  a family  akin  to  the  finite  segments  of  the  Hilbert  matrix  [11]. 
These  ill  conditioned  matrices  have  the  property  that  there  is  no 
obvious perturbation to a matrix of lower rank that results in a perturbed
matrix of satisfactory condition. For large n, rather, there
is  a sequence  of  possible  perturbations  to  nearest  matrices  of  rank 
n-1,  n-2,  etc.  Each  perturbation  in  this  sequence  has  the  property 
that  it  is  neither  much  larger  than  the  previous  perturbation  nor  much 
smaller  than  the  next.  Furthermore  the  corresponding  sequence  of 
nearest  matrices  of  rank  n-1,  n-2,  etc.  has  the  property  that  each 
matrix  is  somewhat  better  conditioned  than  the  previous  but  somewhat 
less  well  conditioned  than  the  next  one.  Thus  the  ill  condition  of  a 
Hilbert  segment  can  not  be  satisfactorily  "explained"  as  due  to  a 
small  perturbation  of  a well  conditioned  matrix  of  lower  rank. 

If  an  analogy  with  the  Hilbert  segments  is  appropriate,  then 
Wilkinson's  polynomial  can  not  be  satisfactorily  "explained"  by  means 
of the numerical methods described in previous chapters. A satisfactory
explanation would entail an understanding and description of
the  geometry  of  the  manifolds  of  polynomials  with  multiple  zeros  and 
their  intersections. 
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6.  Numerical  Results  for  Translation 

Here we summarize some results for translating Wilkinson's polynomial.
The zeros of the translated polynomial are ±0.5, ±1.5, ..., ±9.5.
In the uniform norm the worst conditioned of these are ±8.5 with condition
numbers of 72, which are well enough conditioned for most purposes.
In contrast the condition numbers for ±0.5 are .877E-12.

The nearest polynomial with a double zero had $\zeta = \pm 7.979$ and
$\|q\|$ = .437E-2. Only the condition of the coalescing zeros was improved
significantly, to .402.

Thus  in  this  norm  the  effects  of  translation  go  much  farther 
toward  "amelioration"  of  ill  condition  than  do  any  of  the  movements 
to  manifolds  of  multiple  zeros. 

When  the  translation  to  an  even  polynomial  is  carried  out,  some 
of  the  coefficients  in  the  translated  polynomial  vanish.  Thus,  in  the 
norm  that  measures  relative  changes  in  coefficients,  some  of  the 
weights  become  infinite.  Some  of  the  computer  codes  do  not  handle 
this  case  properly  so  only  partial  results  are  available. 
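
The vanishing coefficients can be seen in a small exact-arithmetic sketch. It assumes only the zeros ±0.5, ±1.5, ..., ±9.5 stated above; the helper name `coefficients_from_zeros` is illustrative and is not one of the codes used for the reported results.

```python
from fractions import Fraction

def coefficients_from_zeros(zeros):
    """Return [p_0, ..., p_n] with prod(t - z) = sum_j p_j * t**(n - j)."""
    coeffs = [Fraction(1)]
    for z in zeros:
        coeffs.append(Fraction(0))
        # Multiply by (t - z) in place, highest index first.
        for j in range(len(coeffs) - 1, 0, -1):
            coeffs[j] -= z * coeffs[j - 1]
    return coeffs

# Zeros of the translated polynomial: -9.5, -8.5, ..., 8.5, 9.5.
zeros = [Fraction(2 * i - 21, 2) for i in range(1, 21)]
p = coefficients_from_zeros(zeros)

# p[j] multiplies t**(20 - j), so odd j corresponds to an odd power of t.
vanishing_powers = [20 - j for j in range(21) if p[j] == 0]
print(vanishing_powers)
```

Every odd power drops out, so a norm that weights each coefficient change by the reciprocal of the coefficient has no finite weight for those terms.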

The  worst  zeros  now  are  ±7.5  with  condition  numbers  .127E+5.  The 
nearest  polynomial  with  a double  zero  appears  to  be  a polynomial  with 
two  double  zeros  at  ±6.979.  Two  double  zeros  are  to  be  expected 
since  the  infinite  weights  constrain  the  perturbation  to  be  even.  In 
contrast,  only  one  double  zero  was  obtained  for  the  uniform  weights. 

Numerical difficulties prevented accurate determination of $\|q\|$.
The difficulties arose from the fact that the code expected only one
double zero so that the other one was poorly determined as two single
zeros. The codes for two double zeros found $\zeta = \pm 8.201$ with
$\|q\|$ = .247E-4 but they seem to have missed the polynomial with
$\zeta = \pm 6.979$.
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7.  Zeros  in  Geometrical  Progression 

In [34] Wilkinson also discussed the polynomial of degree 20
whose zeros are in the geometrical series $2^{-1}, 2^{-2}, \ldots, 2^{-20}$.
From one point of view these zeros are all remarkably well conditioned despite
their apparent crowding near zero. Thus just as the first polynomial
was ill conditioned yet free from clustering in its zeros, this second
polynomial seems well conditioned despite what seems to be extreme
clustering.

For  this  polynomial,  however,  all  depends  on  the  point  of  view. 
Whereas  the  first  polynomial  was  ill  conditioned  whether  uniform  or 
relative perturbations were considered, the second is only well conditioned
when relative perturbations are at stake.

When relative changes both in the coefficients and the zeros are
considered, the worst zero is $2^{-11}$ and its condition number is 65.7;
the other condition numbers are remarkably similar, the best being
8.43. In contrast, when absolute changes in the coefficients and zeros
are at issue, the worst is $2^{-19}$ with a condition number of .109E+59;
the best is $2^{-1}$ with condition number .210E+7. In the uniform norm,
then, this polynomial is far worse conditioned than the better known
one with a linear distribution of zeros.

It  should  be  realized  that  the  coefficients  of  p range  from  1 
to  1E63  in  magnitude.  With  such  a wide  range  of  magnitudes  of  both 
coefficients  and  zeros,  numerical  problems  made  it  difficult  to  obtain 
meaningful  results.  What  results  were  obtained  frequently  failed  some 
of  the  tests  described  in  chapter  VIII.  Since  floating  point  underflow 
is  not  detected  by  the  CDC  6400  and  overflow  is  known  to  have  occurred, 
we  will  not  discuss  these  possibly  contaminated  results. 
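
The spread of coefficient magnitudes can be checked exactly. The sketch below assumes the monic form of the polynomial, for which the coefficients run from 1 down to $2^{-210}$; whether the reported computations used a different scaling is not stated here, but the ratio of largest to smallest magnitude does not depend on the scaling. The helper name is illustrative.

```python
from fractions import Fraction

def coefficients_from_zeros(zeros):
    """Return [p_0, ..., p_n] with prod(t - z) = sum_j p_j * t**(n - j)."""
    coeffs = [Fraction(1)]
    for z in zeros:
        coeffs.append(Fraction(0))
        # Multiply by (t - z) in place, highest index first.
        for j in range(len(coeffs) - 1, 0, -1):
            coeffs[j] -= z * coeffs[j - 1]
    return coeffs

# Zeros 2**-1, 2**-2, ..., 2**-20, held as exact rationals.
zeros = [Fraction(1, 2**j) for j in range(1, 21)]
p = coefficients_from_zeros(zeros)

magnitudes = [abs(c) for c in p]
ratio = max(magnitudes) / min(magnitudes)
print(float(ratio))  # ratio of largest to smallest coefficient magnitude
```

A spread of this size exceeds what can be carried in a single floating point format without careful scaling, which is consistent with the overflow noted above.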
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CHAPTER  XI 
CONCLUDING  REMARKS 

We  have  given  methods  for  finding  nearby  polynomials  with  various 
configurations  of  multiple  zeros.  We  have  exhibited  examples  to  show 
that  these  methods  provide  the  answer  we  would  expect  when  the  correct 
answer  is  obvious. 

For  a polynomial  like  Wilkinson's,  however,  there  is  no  obvious 
answer,  and  these  methods  do  not  provide  satisfactory  explanations  of 
the  ill  condition  of  such  polynomials.  Rather  the  numerical  results 
provide evidence of an inherently complicated structure of the manifold
of polynomials with multiple zeros.

Finally  there  are  intermediate  polynomials  for  which  the  "correct" 
answer is no longer so obvious but which do not seem to present so confusing
a picture as Wilkinson's polynomial. For such intermediate
cases  our  methods  sometimes  provide  results  that  seem  satisfactory  and 
sometimes  do  not.  But  it  is  not  yet  clear  whether  "unsatisfactory" 
results  are  due  to  defects  in  algorithms  or  inappropriate  expectations 
about  the  existence  of  satisfactory  nearby  polynomials. 

In  each  of  these  areas  there  is  ample  scope  for  further  research. 

For  the  "obvious"  cases  we  would  like  to  be  able  to  specify  starting 
points  for  iterative  methods  which  could  be  guaranteed  to  converge 
quickly  to  the  global  minimum. 

For  the  intermediate  cases  we  would  like  to  know  simple  criteria 
for deciding when, for instance, nearby polynomials with complex conjugate
pairs of double zeros may exist. More generally we would like
to know when a solution $\zeta$ of the equations we wish to solve does not
exist in a particular region, so that we need not waste time looking there.
Sketchy information on where to look for $\zeta$ is known for the
case of one double zero, but for other configurations the only known
facts are that the dimensionality of the problem is less than might
have been thought, because certain Lagrange multipliers vanish in the
complex case. We would like to have a simple criterion in the real
case, that will tell us when we may rely on that theorem about Lagrange
multipliers, when we must check real configurations of higher multiplicity,
and when we must check for complex conjugate multiple zeros.

The  new  expansion  technique  discussed  in  chapter  VII  provides 
some  interesting  questions.  In  how  large  a region  can  realistic 
bounds  be  computed  easily?  It  would  be  desirable  to  have  a symbolic 
algebra  program  to  provide  these  tedious  bounds  automatically.  Do 
these  bounds  have  any  significant  advantages  over  Smith's  [42]? 

A task  of  a different  sort  is  to  render  the  existing  mass  of 
algorithmic ideas and devices into mathematical software. The computer
codes with which the research reported here was conducted were
constantly  changing  and  required  considerable  experience  to  direct  the 
search  and  interpret  the  results.  They  were  dependent  on  the  local 
computing  environment  in  many  ways  and  most  likely  contain  some  errors, 
which  would  probably  not  affect  the  results  presented  in  previous 
chapters. 

In  contrast,  respectable  mathematical  software  is  carefully 
specified,  written,  documented,  and  tested.  Then  it  is  independently 
examined  and  tested  again.  The  experienced  computer  programmer  now 
recognizes, moreover, that the production of quality mathematical software
from its raw materials entails as much effort as providing those
raw materials. Consequently that production will be deferred to another
occasion in this case.

The  final,  and  perhaps  most  difficult,  challenge  is  to  unravel 
the nature of the manifold of polynomials with multiple zeros, particularly
in the vicinity of polynomials like Wilkinson's. Although
numerical investigations may sometimes be helpful, probably the principal
factor for success will be the investigator's competence in
algebraic geometry.

Turning  now  to  a more  general  point  of  view,  we  should  recall  that 
one  reason  for  studying  polynomials  is  that  they  are  simpler  than  the 
often  infinite  dimensional  eigenvalue  problems  they  frequently  replace. 
Thus  the  more  general  problem  might  be  stated  as  follows:  given  a 
linear  operator,  some  of  whose  eigenvalues  are  ill  conditioned,  what 
is  the  nearest  linear  operator  whose  eigenvalues,  some  of  them  multiple, 
are  all  well  conditioned? 

Ruhe  [27],  Wilkinson  [36],  and  Kahan  [16]  have  all  given  bounds 
for  the  distance  to  the  nearest  matrix  with  a multiple  eigenvalue. 

Kahan  [17]  and  Golub  and  Wilkinson  [39]  have  also  surveyed  the  known 
theory.  But  there  are  no  known  computational  techniques  which  are 
even  as  reliable  as  those  discussed  previously  for  zeros  of  polynomials. 
The  closest  related  work  is  that  of  Kagstrom  and  Ruhe  [15]  on  finding 
the  Jordan  canonical  form  of  a matrix.  Otherwise  the  many  refractory 
aspects  of  the  problem  remain  untouched  for  future  investigators. 
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1 . Using  the  Zeros  of  a Polynomial  to  Compute  Its  Coefficients 

Our object is to display the well known algorithm for computing
the coefficients of a monic polynomial from its zeros. If we are to
determine the $p_j$ in

\[ \prod_{j=1}^{n} (\tau - \zeta_j) = \tau^n + \sum_{j=1}^{n} p_j \tau^{n-j} \]

and we expand directly we find

\[ p_j = \sum \Bigl[ \prod \bigl(\text{of the } (-\zeta_i)\text{'s in each combination}\bigr) \Bigr], \]

the sum being taken over all $\binom{n}{j}$ combinations of the $n$
$(-\zeta_i)$'s taken $j$ at a time.

We  can  avoid  this  n!  calculation  by  building  up  the  coefficients 
recursively.  If  we  have  a polynomial 


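\[ p^{k}(\tau) = \sum_{j=0}^{k} p^{k}_{j} \tau^{k-j} = \prod_{j=1}^{k} (\tau - \zeta_j), \qquad p^{k}_{0} = 1, \]

we can form the polynomial of degree $k+1$ by multiplication by $(\tau - \zeta_{k+1})$:

\[ p^{k+1}(\tau) = \Bigl( \sum_{j=0}^{k} p^{k}_{j} \tau^{k-j} \Bigr) (\tau - \zeta_{k+1})
   = \sum_{j=0}^{k} p^{k}_{j} \tau^{k+1-j} - \sum_{j=0}^{k} p^{k}_{j} \zeta_{k+1} \tau^{k-j}
   = \sum_{j=0}^{k+1} p^{k+1}_{j} \tau^{k+1-j}, \]

where

\[ p^{k+1}_{j} = \begin{cases}
   -\zeta_{k+1}\, p^{k}_{k}, & j = k+1, \\
   p^{k}_{j} - \zeta_{k+1}\, p^{k}_{j-1}, & j = k, k-1, \ldots, 2, 1, \\
   1, & j = 0.
   \end{cases} \]
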


We  list  the  coefficients  in  the  order  that  they  could  be  successively 
computed  and  overlaid  in  storage. 
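
The recurrence above, applied in the stated order so that each new coefficient overwrites the old one, might be sketched as follows. The function name and the list layout (index `j` holds the coefficient of the power `n - j`) are choices made for this illustration.

```python
def coefficients_from_zeros(zeros):
    """Coefficients [p_0, ..., p_n] of prod(t - z), with p_0 = 1.

    p[j] multiplies t**(n - j).  Each zero is absorbed by the update
    p_j <- p_j - z * p_(j-1), applied for j = k+1, k, ..., 1 so that the
    old p_(j-1) is still in storage when p_j is overwritten.
    """
    coeffs = [1]
    for z in zeros:
        coeffs.append(0)          # room for the new coefficient p_(k+1)
        for j in range(len(coeffs) - 1, 0, -1):
            coeffs[j] = coeffs[j] - z * coeffs[j - 1]
    return coeffs

print(coefficients_from_zeros([1, 2, 3]))   # (t-1)(t-2)(t-3)
print(coefficients_from_zeros([1j, -1j]))   # t**2 + 1, via complex arithmetic
```

The work is proportional to $n^2$, one multiplication and one subtraction per coefficient per zero.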

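In the case of real polynomials, we wish to avoid complex arithmetic
by considering complex zeros and their conjugates together.
Then

\[ p^{k+2}(\tau) = p^{k}(\tau) \cdot \bigl( \tau^2 - 2(\operatorname{Re}\zeta_{k+1})\tau + |\zeta_{k+1}|^2 \bigr) \]

so

\[ p^{k+2}_{j} = \begin{cases}
   |\zeta_{k+1}|^2\, p^{k}_{k}, & j = k+2, \\
   -2(\operatorname{Re}\zeta_{k+1})\, p^{k}_{k} + |\zeta_{k+1}|^2\, p^{k}_{k-1}, & j = k+1, \\
   p^{k}_{j} - 2(\operatorname{Re}\zeta_{k+1})\, p^{k}_{j-1} + |\zeta_{k+1}|^2\, p^{k}_{j-2}, & j = k, \ldots, 3, 2, \\
   p^{k}_{1} - 2(\operatorname{Re}\zeta_{k+1}), & j = 1, \\
   1, & j = 0.
   \end{cases} \]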
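
A sketch of the real-arithmetic variant follows. It assumes the caller supplies the real zeros and one member of each complex conjugate pair separately; that calling convention and the function name are assumptions of this example.

```python
def real_coefficients(real_zeros, pair_zeros):
    """Real coefficients [p_0, ..., p_n] (p_0 = 1, p[j] multiplies t**(n-j)).

    real_zeros : real zeros, each absorbed as a linear factor.
    pair_zeros : one member z of each conjugate pair, absorbed as the real
                 quadratic t**2 - 2*Re(z)*t + |z|**2.
    """
    coeffs = [1.0]
    for z in real_zeros:
        coeffs.append(0.0)
        for j in range(len(coeffs) - 1, 0, -1):
            coeffs[j] -= z * coeffs[j - 1]
    for z in pair_zeros:
        twice_re = 2.0 * z.real
        modulus_sq = z.real * z.real + z.imag * z.imag
        coeffs.extend([0.0, 0.0])
        # j = k+2, k+1, ..., 2: old values at j-1 and j-2 are still intact.
        for j in range(len(coeffs) - 1, 1, -1):
            coeffs[j] += -twice_re * coeffs[j - 1] + modulus_sq * coeffs[j - 2]
        coeffs[1] -= twice_re * coeffs[0]     # j = 1
    return coeffs

# (t - 1)(t - (1+2i))(t - (1-2i)) = t**3 - 3 t**2 + 7 t - 5
print(real_coefficients([1.0], [1 + 2j]))
```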


It may happen that we are only interested in the last few coefficients
or the first few. The formulas above may be used for the first
few coefficients corresponding to high powers of $\tau$.

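To find formulas for the last few coefficients, corresponding to
low powers of $\tau$, we redefine the $p_j$ as follows:

\[ p^{k}(\tau) = \sum_{j=0}^{k} p^{k}_{j} \tau^{j}. \]

Then

\[ p^{k+1}_{j} = \begin{cases}
   1, & j = k+1, \\
   p^{k}_{j-1} - \zeta_{k+1}\, p^{k}_{j}, & j = k, \ldots, 2, 1, \\
   -\zeta_{k+1}\, p^{k}_{0}, & j = 0,
   \end{cases} \]

and in the case $\zeta_{k+2} = \bar{\zeta}_{k+1}$,

\[ p^{k+2}_{j} = \begin{cases}
   1, & j = k+2, \\
   p^{k}_{k-1} - 2 \operatorname{Re}\zeta_{k+1}, & j = k+1, \\
   p^{k}_{j-2} - 2 \operatorname{Re}\zeta_{k+1}\, p^{k}_{j-1} + |\zeta_{k+1}|^2\, p^{k}_{j}, & j = k, \ldots, 2, \\
   -2 \operatorname{Re}\zeta_{k+1}\, p^{k}_{0} + |\zeta_{k+1}|^2\, p^{k}_{1}, & j = 1, \\
   |\zeta_{k+1}|^2\, p^{k}_{0}, & j = 0.
   \end{cases} \]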
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2.  Simultaneous  Evaluation  of  a Polynomial  and  Some  of  its  Derivatives 

Ways  of  efficiently  evaluating  a polynomial  and  its  derivatives 
simultaneously  from  the  coefficients  have  been  studied  by  Shaw  and 
Traub  [29]  among  others. 

Rice [26] has argued that, given the zeros $\zeta_j$ of a polynomial,
computing the product

\[ p(\tau) = \prod_{j=1}^{n} (\tau - \zeta_j) \]

is usually the method of evaluation that minimizes the uncertainty in
$p(\tau)$. When the polynomial is evaluated in this way the relative error
in the final result, due to rounding errors, is always small on a
properly designed machine. In contrast the relative error of the
evaluation from the coefficients is usually large when $\tau$ is near one
of the $\zeta_j$.

Furthermore  if  the  zeros  are  the  primary  data,  rather  than  the 
coefficients,  the  attempt  to  compute  the  coefficients  from  the  zeros 
will,  in  the  presence  of  rounding  errors,  produce  wrong  coefficients 
which  will  be  the  coefficients  of  another  polynomial  with  different 
zeros.  If  the  new  zeros  are  ill  conditioned  they  may  be  rather  far 
removed  from  the  zeros  we  started  with. 

Therefore we prefer to evaluate polynomials and their derivatives
directly from the zeros if they are the primary data. Typical expressions
for the polynomial and two derivatives follow:

Method N:

\[ p(\tau) = \prod_{j=1}^{n} (\tau - \zeta_j), \]

\[ \frac{p'(\tau)}{p(\tau)} = \sum_{j=1}^{n} \frac{1}{\tau - \zeta_j}, \]

\[ \frac{p''(\tau)}{p(\tau)} = \Bigl( \sum_{j=1}^{n} \frac{1}{\tau - \zeta_j} \Bigr)^{2} - \sum_{j=1}^{n} \frac{1}{(\tau - \zeta_j)^{2}}. \]



Similar expressions for higher derivatives may be found by means of
Newton's identities which are described in Householder [12]. These
expressions have the defect, however, that in the presence of rounding
errors, they tend to have high relative errors which are revealed by
cancellation at the end. Thus if $\tau \approx \zeta_j$ in the expression for $p''/p$,
the two subexpressions will tend to cancel with subsequent severe loss
of significant figures. By algebraic manipulation we may be able to
find forms for these expressions in which cancellation is not preordained.
For instance


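\[ \frac{p''(\tau)}{p(\tau)} = 2 \sum_{j=1}^{n-1} \frac{1}{\tau - \zeta_j} \sum_{k=j+1}^{n} \frac{1}{\tau - \zeta_k}, \]

but this expression is not applicable when $\tau = \zeta_j$ exactly.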
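
As an illustration of evaluation through the logarithmic-derivative sums, here is a minimal sketch using the squared-sum form for $p''/p$. The function name is an assumption of this example, and the division makes explicit that the approach fails when $\tau$ coincides with a zero.

```python
def method_n(tau, zeros):
    """Return (p, p', p'') at tau from the zeros via sums of 1/(tau - z).

    Raises ZeroDivisionError when tau equals one of the zeros, which is
    the case in which these expressions do not apply.
    """
    p = 1.0
    s1 = 0.0   # sum of 1/(tau - z)
    s2 = 0.0   # sum of 1/(tau - z)**2
    for z in zeros:
        d = tau - z
        p *= d
        s1 += 1.0 / d
        s2 += 1.0 / (d * d)
    # p'/p = s1 and p''/p = s1**2 - s2; the subtraction is where
    # cancellation occurs when tau is close to a zero.
    return p, p * s1, p * (s1 * s1 - s2)

print(method_n(0.0, [1.0, 2.0, 3.0]))
```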

Therefore  it  is  helpful  to  use  different  methods  for  computing  a 
polynomial  and  its  derivatives  from  its  zeros.  These  methods  are 
based  on  the  observation  that  if 


\[ p(\tau) = \prod_{j=1}^{n} (\tau - \zeta_j) = \sum_{j=0}^{n} p_j \tau^{n-j}, \qquad p_0 = 1, \]

then $p(0) = p_n$, $p'(0) = p_{n-1}$, and in general $p^{(k)}(0) = k!\, p_{n-k}$.
Therefore we can evaluate the polynomial and m derivatives at 0 by
computing the last m+1 coefficients of p from its zeros.

Moreover we can evaluate p and its derivatives at $\alpha$ by computing
the coefficients of the polynomial whose zeros are $\zeta_j - \alpha$:

Method A:

\[ p^{(k)}(\alpha) = k!\, \{ (n-k)\text{th coefficient of polynomial whose zeros are } \zeta_j - \alpha \}. \]

j 

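Another method is based on the observation that

\[ q(\tau) \equiv \tau^{n} p(1/\tau) = \prod_{j=1}^{n} (-\zeta_j) \prod_{j=1}^{n} \Bigl( \tau - \frac{1}{\zeta_j} \Bigr). \]

Then

\[ q_k = p_{n-k} = p(0)\, \{ k\text{th coefficient of polynomial whose zeros are } 1/\zeta_j \} = p^{(k)}(0)/k!. \]

So continuing as before,

Method B:

\[ p^{(k)}(\alpha) = k!\, p(\alpha)\, \Bigl\{ k\text{th coefficient of polynomial whose zeros are } \frac{1}{\zeta_j - \alpha} \Bigr\}. \]

Like Newton's identities, however, this method is undefined if $\alpha = \zeta_j$.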

We might conduct operation counts to help choose from among these
methods. They all require $mn + O(m^2) + O(n)$ operations to evaluate a
polynomial and m derivatives. Therefore we choose Method A since it
is applicable even when $\tau = \zeta_j$.
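
Method A might be sketched as below. Only the lowest $m+1$ coefficients of the shifted polynomial are carried, using the low-power recurrence of section 1; the function name and argument order are assumptions of this example.

```python
from math import factorial

def method_a(alpha, zeros, m):
    """Return [p(alpha), p'(alpha), ..., p^(m)(alpha)] from the zeros.

    low[j] holds the coefficient of t**j in prod(t - (z - alpha)),
    truncated after t**m; then p^(k)(alpha) = k! * low[k].
    """
    low = [1.0] + [0.0] * m
    for z in zeros:
        shifted = z - alpha
        # Multiply by (t - shifted), highest retained power first so the
        # old low[j - 1] is still available.
        for j in range(m, 0, -1):
            low[j] = low[j - 1] - shifted * low[j]
        low[0] = -shifted * low[0]
    return [factorial(k) * low[k] for k in range(m + 1)]

print(method_a(0.0, [1.0, 2.0, 3.0], 2))
print(method_a(1.0, [1.0, 2.0, 3.0], 2))   # alpha equal to a zero is allowed
```

No division occurs, so the evaluation point may coincide with a zero, and each zero costs about $m$ multiplications.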
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3.  Partial  Derivatives  of  a Deflated  Function  of  a Complex  Variable 
When  minimizing  norms  of  functions  of  complex  variables  we  are 
often  required  to  find  zeros  of  non-analytic  functions  of  a complex 
variable.  There  seems  to  be  little  general  theory  for  such  functions 
other  than  that  of  two  real  analytic  functions  of  two  real  variables. 
Consequently  when  finding  zeros  of  such  functions  by  Newton's  method 
we  solve  systems  of  two  equations. 

Having found one solution we may wish to deflate it out in order
to find other solutions. Fortunately there is a way of deflating such
functions that makes sense. In contrast, there is no completely satisfactory
way of deflating solutions of systems of n real equations in
n variables for $n \ge 2$.

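$f(\tau)$ will be the function to be deflated; it is not analytic.
Let $\zeta_1, \ldots, \zeta_k$ be the zeros to be removed; we will divide $f(\tau)$ by
the polynomial

\[ p(\tau) = \prod_{j=1}^{k} (\tau - \zeta_j). \]

The deflated function $g(\tau) = f(\tau)/p(\tau)$ is not analytic, but the
analyticity of p will simplify the expressions for the partial derivatives
of Re g and Im g.

Let $\partial$ represent a differential operator, either $\partial/\partial \operatorname{Re}\tau$ or
$\partial/\partial \operatorname{Im}\tau$. Then

\[ \partial(\operatorname{Re} g) = \partial \operatorname{Re}\Bigl(\frac{f}{p}\Bigr)
   = \partial \Bigl[ \operatorname{Re} f \operatorname{Re}\Bigl(\frac{1}{p}\Bigr) - \operatorname{Im} f \operatorname{Im}\Bigl(\frac{1}{p}\Bigr) \Bigr] \]
\[ = \Bigl[ (\partial \operatorname{Re} f) \operatorname{Re}\Bigl(\frac{1}{p}\Bigr) + \operatorname{Re} f \Bigl(\partial \operatorname{Re}\frac{1}{p}\Bigr)
   - \operatorname{Im} f \Bigl(\partial \operatorname{Im}\frac{1}{p}\Bigr) - (\partial \operatorname{Im} f) \operatorname{Im}\Bigl(\frac{1}{p}\Bigr) \Bigr] \]

and

\[ \frac{\partial \operatorname{Re} g}{\partial \operatorname{Re}\tau}
   = \operatorname{Re}\Bigl(\frac{1}{p}\Bigr) \frac{\partial \operatorname{Re} f}{\partial \operatorname{Re}\tau}
   - \operatorname{Im}\Bigl(\frac{1}{p}\Bigr) \frac{\partial \operatorname{Im} f}{\partial \operatorname{Re}\tau}
   - \operatorname{Re}\Bigl(\frac{f p'}{p^2}\Bigr), \]

where $p'$ represents the complex derivative of p, and similarly

\[ \frac{\partial \operatorname{Re} g}{\partial \operatorname{Im}\tau}
   = \operatorname{Re}\Bigl(\frac{1}{p}\Bigr) \frac{\partial \operatorname{Re} f}{\partial \operatorname{Im}\tau}
   - \operatorname{Im}\Bigl(\frac{1}{p}\Bigr) \frac{\partial \operatorname{Im} f}{\partial \operatorname{Im}\tau}
   + \operatorname{Im}\Bigl(\frac{f p'}{p^2}\Bigr), \]

and

\[ \frac{\partial \operatorname{Im} g}{\partial \operatorname{Re}\tau}
   = \operatorname{Re}\Bigl(\frac{1}{p}\Bigr) \frac{\partial \operatorname{Im} f}{\partial \operatorname{Re}\tau}
   + \operatorname{Im}\Bigl(\frac{1}{p}\Bigr) \frac{\partial \operatorname{Re} f}{\partial \operatorname{Re}\tau}
   - \operatorname{Im}\Bigl(\frac{f p'}{p^2}\Bigr), \]

and

\[ \frac{\partial \operatorname{Im} g}{\partial \operatorname{Im}\tau}
   = \operatorname{Re}\Bigl(\frac{1}{p}\Bigr) \frac{\partial \operatorname{Im} f}{\partial \operatorname{Im}\tau}
   + \operatorname{Im}\Bigl(\frac{1}{p}\Bigr) \frac{\partial \operatorname{Re} f}{\partial \operatorname{Im}\tau}
   - \operatorname{Re}\Bigl(\frac{f p'}{p^2}\Bigr). \]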


These partial derivatives are now in a form suitable for use in Newton's
method applied to a system of two real equations in two unknowns.
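
A sketch of how the four partial derivatives of the deflated function could be assembled follows. It assumes the caller supplies $f$ and its two complex-valued partials at $\tau$; the function name, argument layout, and the sample non-analytic function are all illustrative.

```python
def deflated_partials(f, f_x, f_y, tau, removed_zeros):
    """Partials of g = f/p at tau, where p = prod(tau - z) over removed_zeros.

    f   : complex value f(tau)
    f_x : complex value of df/d(Re tau)
    f_y : complex value of df/d(Im tau)
    Returns ((dRe g/dRe tau, dRe g/dIm tau), (dIm g/dRe tau, dIm g/dIm tau)).
    """
    p, dp = 1.0 + 0.0j, 0.0j
    for z in removed_zeros:
        dp = dp * (tau - z) + p      # complex derivative p' by the product rule
        p = p * (tau - z)
    h = 1.0 / p                      # 1/p
    c = f * dp / (p * p)             # f p' / p**2
    re_x = h.real * f_x.real - h.imag * f_x.imag - c.real
    re_y = h.real * f_y.real - h.imag * f_y.imag + c.imag
    im_x = h.real * f_x.imag + h.imag * f_x.real - c.imag
    im_y = h.real * f_y.imag + h.imag * f_y.real - c.real
    return (re_x, re_y), (im_x, im_y)

# Sample non-analytic f(tau) = conj(tau)**2 + tau, deflated by zeros 1 and -2i.
tau = 0.3 + 0.4j
f = tau.conjugate() ** 2 + tau
f_x = 2 * tau.conjugate() + 1
f_y = -2j * tau.conjugate() + 1j
print(deflated_partials(f, f_x, f_y, tau, [1, -2j]))
```

The four numbers returned are the entries of the 2 by 2 Jacobian needed for a Newton step on the deflated system.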
4.  Computing the Divided Differences Required in the Equations
to be Solved for Complex Conjugate Double Zeros in Chapter IV

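Below will be found the recurrences required to compute
$\Delta_p$, $\Delta_{p'}$, and $\Delta_k$,
the divided differences of section IV.3. We will also obtain derivatives
with respect to $\operatorname{Re}\zeta$ and $\operatorname{Im}\zeta$, for use with Newton's method.

\[ \Delta_k \equiv (\operatorname{Im} \zeta^{k}) / (\operatorname{Im} \zeta) \]

so $\Delta_0 = 0$, $\Delta_1 = 1$, and

\[ \Delta_k = \operatorname{Re}\zeta\, \Delta_{k-1} + \operatorname{Re}(\zeta^{k-1}). \]

If we write

\[ \Delta_k^{\mathrm{Re}} \equiv \frac{\partial \Delta_k}{\partial \operatorname{Re}\zeta} \]

and

\[ \Delta_k^{\mathrm{Im}} \equiv \frac{\partial \Delta_k}{\partial \operatorname{Im}\zeta} \]

we find:

\[ \Delta_0 = 0, \qquad \Delta_k = \operatorname{Re}\zeta\, \Delta_{k-1} + \operatorname{Re}(\zeta^{k-1}), \]

\[ \Delta_0^{\mathrm{Im}} = \Delta_1^{\mathrm{Im}} = \Delta_2^{\mathrm{Im}} = 0, \qquad
   \Delta_k^{\mathrm{Im}} = \operatorname{Re}\zeta\, \Delta_{k-1}^{\mathrm{Im}} - (k-1) \operatorname{Im}(\zeta^{k-2}), \]

\[ \Delta_0^{\mathrm{Re}} = \Delta_1^{\mathrm{Re}} = 0, \qquad
   \Delta_k^{\mathrm{Re}} = \Delta_{k-1} + \operatorname{Re}\zeta\, \Delta_{k-1}^{\mathrm{Re}} + (k-1) \operatorname{Re}(\zeta^{k-2}). \]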
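
The three recurrences for $\operatorname{Im}(\zeta^k)/\operatorname{Im}\zeta$ and its partial derivatives can be run together, as in the following sketch. The function name and return layout are assumptions of this example.

```python
def delta_sequence(zeta, n):
    """Return (d, d_re, d_im) for k = 0..n, where
    d[k]    = Im(zeta**k) / Im(zeta),
    d_re[k] = partial of d[k] with respect to Re(zeta),
    d_im[k] = partial of d[k] with respect to Im(zeta).
    No division by Im(zeta) is performed.
    """
    x = zeta.real
    d = [0.0, 1.0]
    d_re = [0.0, 0.0]
    d_im = [0.0, 0.0]
    power = 1.0 + 0.0j               # zeta**(k-2), starting at k = 2
    for k in range(2, n + 1):
        next_power = power * zeta    # zeta**(k-1)
        d.append(x * d[k - 1] + next_power.real)
        d_re.append(d[k - 1] + x * d_re[k - 1] + (k - 1) * power.real)
        d_im.append(x * d_im[k - 1] - (k - 1) * power.imag)
        power = next_power
    return d[: n + 1], d_re[: n + 1], d_im[: n + 1]

d, d_re, d_im = delta_sequence(1.5 + 0.5j, 4)
print(d)
```

Because the recurrences never divide by $\operatorname{Im}\zeta$, they remain usable as the conjugate pair approaches the real axis.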


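In order to compute $\Delta_p$ and $\Delta_{p'}$, it is necessary to start by
recalling the formulas for updating $p$ and $p'$. If the zeros of p
are $\zeta_i$, $1 \le i \le n$, then we could define

\[ p_k = \prod_{i=1}^{k} (\zeta - \zeta_i). \]

Then we may imagine updating $p_k$ by one real zero $\zeta_+$ or by two
complex conjugate zeros $\zeta_+$ and $\bar{\zeta}_+$. Then

\[ p_0 = 1, \qquad p_{k+1} = (\zeta - \zeta_+)\, p_k, \qquad
   p_{k+2} = \{ \zeta^2 - 2(\operatorname{Re}\zeta_+)\zeta + |\zeta_+|^2 \}\, p_k, \]

\[ p'_0 = 0, \qquad p'_{k+1} = (\zeta - \zeta_+)\, p'_k + p_k, \qquad
   p'_{k+2} = \{ \zeta^2 - 2(\operatorname{Re}\zeta_+)\zeta + |\zeta_+|^2 \}\, p'_k + 2(\zeta - \operatorname{Re}\zeta_+)\, p_k, \]

\[ p''_0 = 0, \qquad p''_{k+1} = (\zeta - \zeta_+)\, p''_k + 2 p'_k, \qquad
   p''_{k+2} = \{ \zeta^2 - 2(\operatorname{Re}\zeta_+)\zeta + |\zeta_+|^2 \}\, p''_k + 4(\zeta - \operatorname{Re}\zeta_+)\, p'_k + 2 p_k. \]

The formulas for computing $\Delta_p$ and its derivatives are as follows: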
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Note  in  passing  that 


Im  pk  Im  pj^ 


3 Re  c Im  ; " ImT 
We  now  state  the  corresponding  formulas  required  for  A 


Im  ; 

3 

Im  CP' 

3 Re  c Im  ; 

3 

Im  ;p' 

3 Im  ; Im  ; 

3 

Im  Pk+, 

3 Re; 

Im  ; 

32 

Im  Pk+, 

(3  Re;)2 

Im  ; 

32 

Im  Pk+, 

3 Im  ; 3 Re  ; 

Im  ; 

3 

I"  Pk*2 

3 Re  ; 

Im  ; 

32 

!m  pk+2 

(a  Re  c)z 

Im  ; 

a2 

Im  Pk+2 

dim;  3 Re  ; 

Im  ; 

a Re  p*  + Re  ; 


Im 


Im  ; 


CP'  * 


lRe'1'  dh'wf) 

Im 


ORec) 

-Im  px 


rtH>. 


Imp. 


Im  p. 


le/r.r  )-  9 Kk 

U C+J3Re;  TiT  ’ 

Im  Pl  Im  p 

? Im  C 

\ v « ( 

Imp 


(3  Re;)' 

,2 


Im  p. 


3-  *>..  Kk 

iT;  3 Re  c Im  c 


Im  p 


Im  p. 


Im  p. 


' 4 Re  + 2 ts~+  2 ReU (Re  pk  * 2 mi  -Tirr1 


Im  p. 


♦ ReUc  - 2(Re  ;+);  + UJ*)-  » .-JL, 

(3Rec)21irr 

-« Im  + 2 Re(c-;+)  (-Im  pj  +T^.  i^A) 


5.  Computing  the  Divided  Differences  Required  in  the  Equations 
to  be  Solved  for  Two  Real  Double  Zeros  in  Chapter  V 
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The  equations  which  follow  provide  recurrent  methods  for  computing 
the  divided  differences  required  to  solve  (6.6)  of  chapter  V. 

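Recall

\[ \Delta_k \equiv (\zeta_1^{k} - \zeta_2^{k}) / (\zeta_1 - \zeta_2). \]

Therefore $\Delta_0 = 0$, and $\Delta_1 = 1$. We may verify that

\[ \Delta_k = \zeta_1 \Delta_{k-1} + \zeta_2^{k-1}, \]

\[ \frac{\partial \Delta_k}{\partial \zeta_1} = \zeta_1 \frac{\partial \Delta_{k-1}}{\partial \zeta_1} + \Delta_{k-1}, \]

\[ \frac{\partial \Delta_k}{\partial \zeta_2} = \zeta_2 \frac{\partial \Delta_{k-1}}{\partial \zeta_2} + \Delta_{k-1}. \]
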
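The equations for $\Delta_{p,k}$ are more complicated. Recall that

\[ \Delta_{p,k} \equiv \frac{\zeta_2^{k}\, p(\zeta_1) - \zeta_1^{k}\, p(\zeta_2)}{\zeta_1 - \zeta_2}. \]

To compute $\Delta_{p,k}$ when p is given in factored form, it is necessary
to fix k and develop $\Delta_{p,k}$ recursively by considering each factor
of p in turn. To start, suppose p = 1; then $\Delta_{p,k} = -\Delta_k$. Now
suppose that $\Delta_{p,k}$ is known and p is to be multiplied by a linear
factor $(\tau - \alpha)$. Then denoting the new divided difference by ${}_{+1}\Delta_{p,k}$,

\[ {}_{+1}\Delta_{p,k} \equiv \frac{\zeta_2^{k}\, p(\zeta_1)(\zeta_1 - \alpha) - \zeta_1^{k}\, p(\zeta_2)(\zeta_2 - \alpha)}{\zeta_1 - \zeta_2}
   = \zeta_1 \zeta_2\, \Delta_{p,k-1} - \alpha\, \Delta_{p,k}. \]

Furthermore

\[ \frac{\partial\, {}_{+1}\Delta_{p,k}}{\partial \zeta_1}
   = \zeta_2 \Bigl\{ \zeta_1 \frac{\partial \Delta_{p,k-1}}{\partial \zeta_1} + \Delta_{p,k-1} \Bigr\}
   - \alpha \frac{\partial \Delta_{p,k}}{\partial \zeta_1} \]

and

\[ \frac{\partial\, {}_{+1}\Delta_{p,k}}{\partial \zeta_2}
   = \zeta_1 \Bigl\{ \zeta_2 \frac{\partial \Delta_{p,k-1}}{\partial \zeta_2} + \Delta_{p,k-1} \Bigr\}
   - \alpha \frac{\partial \Delta_{p,k}}{\partial \zeta_2}. \]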


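If p were real and $\alpha$ were complex it would be desirable to
update p by the real quadratic factor $(\tau - \alpha)(\tau - \bar{\alpha})$. Let ${}_{+2}\Delta_{p,k}$
represent this updated divided difference:

\[ {}_{+2}\Delta_{p,k} \equiv \frac{\zeta_2^{k}\, p(\zeta_1)(\zeta_1^2 - 2 \operatorname{Re}\alpha\, \zeta_1 + |\alpha|^2)
   - \zeta_1^{k}\, p(\zeta_2)(\zeta_2^2 - 2 \operatorname{Re}\alpha\, \zeta_2 + |\alpha|^2)}{\zeta_1 - \zeta_2} \]
\[ = (\zeta_1 \zeta_2)^2\, \Delta_{p,k-2} - 2(\operatorname{Re}\alpha)\, \zeta_1 \zeta_2\, \Delta_{p,k-1} + |\alpha|^2\, \Delta_{p,k}, \]

\[ \frac{\partial\, {}_{+2}\Delta_{p,k}}{\partial \zeta_1}
   = \zeta_1 \zeta_2^2 \Bigl\{ 2 \Delta_{p,k-2} + \zeta_1 \frac{\partial \Delta_{p,k-2}}{\partial \zeta_1} \Bigr\}
   + |\alpha|^2 \frac{\partial \Delta_{p,k}}{\partial \zeta_1}
   - 2(\operatorname{Re}\alpha)\, \zeta_2 \Bigl\{ \Delta_{p,k-1} + \zeta_1 \frac{\partial \Delta_{p,k-1}}{\partial \zeta_1} \Bigr\}. \]

The corresponding equation for $\partial\, {}_{+2}\Delta_{p,k} / \partial \zeta_2$ may be found by interchanging
$\zeta_1$ and $\zeta_2$.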

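Similar methods may be applied to

\[ \Delta_{p',k} \equiv \frac{\zeta_2^{k}\, p'(\zeta_1) - \zeta_1^{k}\, p'(\zeta_2)}{\zeta_1 - \zeta_2}. \]

Note first that

\[ p_{+1}(\tau) = (\tau - \alpha)\, p(\tau), \]
\[ p'_{+1}(\tau) = (\tau - \alpha)\, p'(\tau) + p(\tau), \]
\[ p''_{+1}(\tau) = (\tau - \alpha)\, p''(\tau) + 2 p'(\tau), \]

and

\[ p_{+2}(\tau) = (\tau - \alpha)(\tau - \bar{\alpha})\, p(\tau), \]
\[ p'_{+2}(\tau) = 2(\tau - \operatorname{Re}\alpha)\, p(\tau) + (\tau - \alpha)(\tau - \bar{\alpha})\, p'(\tau), \]
\[ p''_{+2}(\tau) = 2 p(\tau) + 4(\tau - \operatorname{Re}\alpha)\, p'(\tau) + (\tau - \alpha)(\tau - \bar{\alpha})\, p''(\tau). \]

Then

\[ {}_{+1}\Delta_{p',k} = \zeta_1 \zeta_2\, \Delta_{p',k-1} - \alpha\, \Delta_{p',k} + \Delta_{p,k}, \]

\[ \frac{\partial\, {}_{+1}\Delta_{p',k}}{\partial \zeta_1}
   = \zeta_2 \Bigl\{ \zeta_1 \frac{\partial \Delta_{p',k-1}}{\partial \zeta_1} + \Delta_{p',k-1} \Bigr\}
   - \alpha \frac{\partial \Delta_{p',k}}{\partial \zeta_1} + \frac{\partial \Delta_{p,k}}{\partial \zeta_1}, \]

\[ {}_{+2}\Delta_{p',k} = (\zeta_1 \zeta_2)^2\, \Delta_{p',k-2} - 2(\operatorname{Re}\alpha)\, \zeta_1 \zeta_2\, \Delta_{p',k-1} + |\alpha|^2\, \Delta_{p',k}
   + 2 \zeta_1 \zeta_2\, \Delta_{p,k-1} - 2(\operatorname{Re}\alpha)\, \Delta_{p,k}, \]

\[ \frac{\partial\, {}_{+2}\Delta_{p',k}}{\partial \zeta_1}
   = \zeta_1 \zeta_2^2 \Bigl\{ \zeta_1 \frac{\partial \Delta_{p',k-2}}{\partial \zeta_1} + 2 \Delta_{p',k-2} \Bigr\}
   + |\alpha|^2 \frac{\partial \Delta_{p',k}}{\partial \zeta_1}
   - 2(\operatorname{Re}\alpha)\, \zeta_2 \Bigl\{ \zeta_1 \frac{\partial \Delta_{p',k-1}}{\partial \zeta_1} + \Delta_{p',k-1} \Bigr\} \]
\[ \qquad + 2 \zeta_2 \Bigl\{ \zeta_1 \frac{\partial \Delta_{p,k-1}}{\partial \zeta_1} + \Delta_{p,k-1} \Bigr\}
   - 2(\operatorname{Re}\alpha) \frac{\partial \Delta_{p,k}}{\partial \zeta_1}. \]
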


These formulas may be used to calculate $\Delta_{p,k}$ and $\Delta_{p',k}$ except
when k = 0 or k = 1. In those cases the formulas would require
$\Delta_{p,-1}$, which is not defined.

To deal with that difficulty, different formulas for divided
differences must be used for k = 0 and k = 1. These formulas will
be based on the finite difference analog of Leibniz' rule:

\[ \Delta(xy)(\epsilon_1, \epsilon_2) \equiv \frac{xy(\epsilon_1) - xy(\epsilon_2)}{\epsilon_1 - \epsilon_2}
   = \Bigl( \frac{x(\epsilon_1) + x(\epsilon_2)}{2} \Bigr) \Bigl( \frac{y(\epsilon_1) - y(\epsilon_2)}{\epsilon_1 - \epsilon_2} \Bigr)
   + \Bigl( \frac{y(\epsilon_1) + y(\epsilon_2)}{2} \Bigr) \Bigl( \frac{x(\epsilon_1) - x(\epsilon_2)}{\epsilon_1 - \epsilon_2} \Bigr). \]

Here x and y are functions of a single variable; the divided
difference of the product xy is sought for the points $(\epsilon_1, \epsilon_2)$.
This and other divided difference identities may be found in the book
by Milne-Thomson [23].
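
The linear-factor update and the special treatment of k = 0 by the Leibniz analog can be checked numerically. The sketch below uses the definition $\Delta_{p,k} = (\zeta_2^k p(\zeta_1) - \zeta_1^k p(\zeta_2))/(\zeta_1 - \zeta_2)$ with real zeros absorbed one at a time; the function names and the sample data are assumptions of this example.

```python
def delta_pk_direct(zeros, k, z1, z2):
    """Delta_{p,k} straight from the definition, for comparison only."""
    p1 = p2 = 1.0
    for a in zeros:
        p1 *= z1 - a
        p2 *= z2 - a
    return (z2 ** k * p1 - z1 ** k * p2) / (z1 - z2)

def delta_pk_recursive(zeros, kmax, z1, z2):
    """Delta_{p,k} for k = 0..kmax, absorbing one linear factor at a time."""
    # Start with p = 1, where Delta_{p,k} = -Delta_k.
    plain = [0.0, 1.0]
    for k in range(2, kmax + 1):
        plain.append(z1 * plain[k - 1] + z2 ** (k - 1))
    d = [-v for v in plain[: kmax + 1]]
    p1 = p2 = 1.0                    # running p(z1) and p(z2)
    for a in zeros:
        # k = 0: product rule for divided differences with y = (t - a).
        new0 = 0.5 * ((z1 - a) + (z2 - a)) * d[0] + 0.5 * (p1 + p2)
        # k >= 1: new value z1*z2*D[k-1] - a*D[k], highest k first.
        for k in range(kmax, 0, -1):
            d[k] = z1 * z2 * d[k - 1] - a * d[k]
        d[0] = new0
        p1 *= z1 - a
        p2 *= z2 - a
    return d

zeros = [0.5, -1.0, 2.0]
print(delta_pk_recursive(zeros, 3, 1.3, 0.7))
print([delta_pk_direct(zeros, k, 1.3, 0.7) for k in range(4)])
```

The direct form divides a small difference by $\zeta_1 - \zeta_2$ at the end, whereas the recursive form performs that division only for the starting values, which is the motivation for building the divided differences factor by factor.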

For our application, x will be $p(\tau)$ or $p'(\tau)$ and y will
be the updating factor $(\tau - \alpha)$ or $(\tau - \alpha)(\tau - \bar{\alpha})$. We find that


U2-a)  P<C-| ) + P(C2) 

f.  n.o  9 Vn  + 9 » 


2 *P,0T F 

PtO  (Ci-tt)  + (c«-o)  3 A_  a i ^ 

sq -i  1 • 

(cra)(cra)  ♦ (c2-o)(;2-o) 


♦1  P-° 

3 A 

+1 


♦A2p.°- 


(- 


*p,o 


pUJ  + pU?) 

♦ ^ (U-J  - Re  a)(c2-Re  a))  , 

■kr  * ( — H — — j-Jj2  * «,  - * «>*,.< 

pUJ  + pUJ  , 

+ — + ^>'(;1)((;1 -Re  o)(;2-Re  a))  , 


-c1+52 


P(^)  + p(C2) 


f2 


3A_ 


. o P » 1 £■»"*"  4 oaln  f)  1 

“ 3^  = (~2  2kea)?1?,2  3?*  + S2(;l  +2^2  “ 2 Rea)Ap,( 


’1 


3A 


+ l“!2'3^itK(?lP'(cl)+pUl))  ' 

Similarly 

(t,-a)  + (s.-a)  P'UJ  + P'U  2) 

* p',0  * (-’—■ y- Z-  -)V.O  + (—4 -)  + *p,0  • 

8<t1P''°  = ,(Va>^;-°yV,Q  1 3^0 

9C-]  1 2 J 3c1  rp'.o  F uv  + a;  » 

p'U^  + p'UJ 


+2P'.0 


- £ g ){(;.|-Re  a)  + (?2-Rea)} 


(.(C1-o)^1-a)  + (i;2-a)(;2-a)^ 

+ ( 2 JV.o 


+ [U}  - Re  a)  + (;2  - Re  a))Ap>0  + P^ ) + p(c2)  , 
a+9P'.0  (c, -Re  a)  + (c2-Re  a)  P*  (c, ) + p' (c9) 

-np*  H H VV* — 4 — - 

, % (C1-o)(C1*a) + (c?-a)(c«-a)  3An,  n 

+ "i  - Re  »>*»■  ,o + (-J — — r1 — M-fe2 


9A. 


+ ((^  - Re  a)  + {;2  - Re  a))-^  + Ap>0  + p'  (^ ) . 


Finally 


P'UJ  + P'UJ  ,4+Co 

*2p'.l  * W 4 -)  + !-V-2Re^^,Vp.,o 

+ l“l2lp'.>  + Ztl?2*P,0  " 2^Re  “**p,l  > 
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,i  = (-V-2Rp“Hi^P,o + + cicgt— '4 — ) • 
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\ 


* 


3 A , , 
+2  p ’ 


9C 


1 


1 , X .P'UJ  + P'Uo)  3AD  n 

= ^1C2P"(C1)  + lz{— —£ —)  - 2(Re  a)-^ 


1 h+b 

+ + (-^-2  Rea)Hplj0 


’2  2 1 

£ 

2 


3A  i n 

+ ( 5 2 Re  a);^;2  3^’ 


Ha| 2^l+25,(Cl  ft>. 


1 


■2^1  *1; 


1 


P,0J 


Taken together, the foregoing equations provide all the divided
differences required in chapter V. To inhibit convergence to the
remaining unwanted solutions it is still necessary to use the deflation
techniques of section 5 of that chapter.
6.  The Lagrange Multiplier Theorem

The following corollary of the Fredholm Alternative Theorem
provides the basis for the use of Lagrange multipliers to find
stationary points of functions subject to constraints. The vector $\ell$
is the vector of Lagrange multipliers.

Theorem. Let B map $C^n$ to $C^m$. Then

(for every $x \in C^n$, $Bx = 0 \Rightarrow y^* x = 0$)

if and only if there exists an $\ell^* \in C^m$ such that

$y^* = \ell^* B$.

See Dunford and Schwartz [9, p. 609] for a statement of the Fredholm
Alternative Theorem in an arbitrary Banach space, and for references
to a proof.
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The  numerical  experiments  show  that  Newton's 
method  may  be  used  successfully  to  solve  the  equa- 
tions in  the  cases  of  greatest  interest  when  the 
expected  result  is  sufficiently  simple.  The  tech- 
niques may  also  be  applied  to  polynomials  such  as 
Wilkinson's  Tamous  example  whose  zeros  are  the 
integers  rrom  1 to  ?0.  But  then  the  numerical 
results  suggest  *hat  that  ill  conditioned  polyno- 
mial can  not  bf  explained  successfully  as  a small 
perturbation  of  a well  conditioned  polynomial. 
Instead  Wilkinson's  polynomial  lies  in  a region  of 
polynomial  space  whose  geometry  seems  to  be  excep- 
tionally complicated. 

Bounds  on  uncertainties  in  zeros  corresponding 
to  uncertainties  in  coefficients  are  customarily 
computed  with  Taylor  series.  For  ill  conditioned 
simple  zeros  these  Taylor  series  ho/e  radii  of  con- 
vergence that  are  much  too  small.  The  well  condi- 
tioned multiple  zeros  of  a nearby  polynomial  are 
not  amenable  to  Taylor  series  expansions  but  may  be 
expanded  in  a Puiseux  fractional  power  series. 

These  fractional  power  series,  however,  also  have 
unsatisfactory  regions  of  convergence.  But  by 
choosing  a different  starting  point  the  convergence 
problem  of  the  Puiseux  series  can  be  overcome  to 
produce,  in  principle,  series  that  converge  rapidl) 
throughout  the  region  of  interest.  In  practice 
those  series  are  used  to  produce  realistic  bounds 
on  the  uncertainties  in  the  zeros.  Full  exploita- 
tion of  these  techniques  awaits  adequate  facilities 
for  symbolic  algebra. 
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